Python web scraping: extracting script data embedded in HTML (stock quotes from Xueqiu)

1. Analyze the data format of the page content

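  • Open https://xueqiu.com/hq/detail?order=desc&orderBy=percent&type=sha&market=CN&first_name=0&second_name=3

  • Press F12 (or right-click on the page --> Inspect)

  • Switch to the Network tab

  • Click on the page and press Ctrl + R to reload it; the Network tab fills with the requests the page makes

  • In the Network tab, click the magnifier (search) icon and type some text you see on the page, e.g. the name of a stock

  • In the Name column, find the entry named detail; the panel on the right shows its Headers, Preview, Response, and so on

  • Inspect the Response: the data of interest sits in the lower part of the HTML page.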
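The same inspection can be done in code once the HTML text is available. The sketch below swaps in the standard-library html.parser module (the rest of this post uses regular expressions instead) to list every script tag with its id, which makes it easy to spot the block that carries the data. The class name ScriptLister and the sample_html string are made up for illustration; the real page is much larger.

```python
from html.parser import HTMLParser


class ScriptLister(HTMLParser):
    """Collect (id, text) for every <script> element in a document."""

    def __init__(self):
        super().__init__()
        self.scripts = []        # list of (id attribute or None, script text)
        self._in_script = False
        self._current_id = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self._current_id = dict(attrs).get("id")
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_script:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_script:
            self.scripts.append((self._current_id, "".join(self._buffer)))
            self._in_script = False


# Hypothetical, heavily shortened stand-in for the real response body.
sample_html = (
    '<html><body><div id="app"></div>'
    '<script>var x = 1;</script>'
    '<script id="initStore">window.__INITIAL_STORE__ = {"isMobile": false}</script>'
    '</body></html>'
)

lister = ScriptLister()
lister.feed(sample_html)
lister.close()

for script_id, text in lister.scripts:
    print(f"id={script_id!r}, length={len(text)}")
```

The script whose id is initStore is the one the later sections extract.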


2. Scraping stock quotes with re.findall (request rejected)

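Key point: everything between <script id="initStore">window.__INITIAL_STORE__ = and </script> is JSON. json.loads automatically converts JSON false to Python False and true to True.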
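A minimal sketch of that conversion, using a made-up JSON string (only the isMobile key appears in the real data used later; the other keys are illustrative assumptions):

```python
import json

# Hypothetical JSON text in the style of the embedded store data.
raw = '{"isMobile": false, "loggedIn": true, "user": null, "percent": 10.02}'

parsed = json.loads(raw)

# JSON literals become native Python values.
print(type(parsed))          # a dict
print(parsed["isMobile"])    # Python False
print(parsed["loggedIn"])    # Python True
print(parsed["user"])        # Python None
```

The result is an ordinary dict, so fields are read with normal indexing.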


import re
import requests
import json

# Target URL
url = "https://xueqiu.com/hq/detail?order=desc&orderBy=percent&type=sha&market=CN&first_name=0&second_name=3"

response = requests.get(url)
str1 = response.content.decode()
print(f"str1 = [{str1}]")

# Search with a regular expression; findall returns a list (take the first element to get the JSON text)
result = re.findall("<script id=\"initStore\">window.__INITIAL_STORE__ = (.*?)</script>", str1)
print(result)

Output:


str1 = [403 Forbidden. Your IP Address: 124.202.215.98 .]
[]

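The server answers 403 because the request carries no browser-like headers. The page can only be fetched once headers are added, which tells the site the client is a browser rather than a bot.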
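A small sketch of how the failure could be surfaced earlier instead of ending in an empty list. The helper names browser_like_headers and fetch_html are hypothetical, and whether a User-Agent alone satisfies this site is an assumption; the next section sends the browser's full headers and cookies.

```python
def browser_like_headers(user_agent: str) -> dict:
    """Build a minimal header set that identifies the client as a browser."""
    return {
        "User-Agent": user_agent,
        "Accept": "text/html",
    }


def fetch_html(url: str, headers: dict, timeout: float = 10.0) -> str:
    """Fetch a page and raise immediately on 4xx/5xx such as 403."""
    # Third-party dependency, imported lazily so that browser_like_headers
    # can be used without requests installed.
    import requests

    response = requests.get(url, headers=headers, timeout=timeout)
    response.raise_for_status()   # raises requests.HTTPError on 403
    return response.content.decode()


headers = browser_like_headers(
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)
print(headers)
# html = fetch_html("https://xueqiu.com/hq/detail", headers)  # network call, not run here
```

With raise_for_status in place, a rejected request stops the script with an HTTP error rather than letting the regular expression run on an error page.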

3. Scraping stock quotes with re.findall (working)

  • Copy the request as cURL

    Select the desired entry, then right-click --> Copy --> Copy as cURL



curl 'https://xueqiu.com/hq/detail?order=desc&orderBy=percent&type=sha&market=CN&first_name=0&second_name=3' \
  -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7' \
  -H 'Accept-Language: zh-CN,zh;q=0.9' \
  -H 'Cache-Control: max-age=0' \
  -H 'Connection: keep-alive' \
  -H 'Cookie: cookiesu=241712922404752; device_id=6a73424ed3aae5c44aeb59b0ddfbc91b; smidV2=20240412194646e6b74b8abc2e9e66b752832ee3e0ee4800ab44152b05354c0; s=bo11shkdxf; remember=1; xq_a_token=7ef03deb28d3396dc9d555329881fd9986211657; xqat=7ef03deb28d3396dc9d555329881fd9986211657; xq_id_token=eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJ1aWQiOjE4NzgxNDkwNzEsImlzcyI6InVjIiwiZXhwIjoxNzE1NjYzMDk3LCJjdG0iOjE3MTMxMDA5Mjc2MjYsImNpZCI6ImQ5ZDBuNEFadXAifQ.XUMYCQrcozefxhq-MVz8kB39_b5LC-wfZIEk7wytUPoTufTNNsYGnlxmoaT09V1_jadkKemvEfeDbneSTs6OaEp_aTNjMTN12xSKmUxwqqfpqzjWgZrOsUAwW3ArHYNrbT0llkfZR0nAh36p54Zl2ln-auokNRuEiqkrF-Ivpd8FPxs_b5SVXhbIM1mRgdTGyjWwCiHE9TNa7AzG870_fwimq0HefT88pjEvZSJ2tGdYAgWTYK6rmY_4nrjais4IodjkcmXpP7sFM_-OYN5NanzonuMK9OhbbOBiWNusORBeXoRUSDAyUFdNwc7Vsn5iDOm08FzVp-QmUtRGo2ne-Q; xq_r_token=62e0ff828b86cfa501f1520ad6570a99838e72e5; xq_is_login=1; u=1878149071; Hm_lvt_1db88642e346389874251b5a1eded6e3=1712922406,1713336629; Hm_lpvt_1db88642e346389874251b5a1eded6e3=1713748510; .thumbcache_f24b8bbe5a5934237bbc0eda20c1b6e7=tS0VaWi7obEkSury7ep6QSmK3ixl5usjG6jN/M/opO639mLrQLd0/M5xBK+0SJT5l937mMvmchxm5Vx+uE7pdQ%3D%3D' \
  -H 'Sec-Fetch-Dest: document' \
  -H 'Sec-Fetch-Mode: navigate' \
  -H 'Sec-Fetch-Site: same-origin' \
  -H 'Sec-Fetch-User: ?1' \
  -H 'Upgrade-Insecure-Requests: 1' \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36' \
  -H 'sec-ch-ua: "Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "Linux"' \
  --compressed


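After converting the cURL command to Python, choose [Copy to clipboard] and paste the result into PyCharm; it can be used as-is.

In the converted code, note that the format requested in the headers is text/html.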
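The conversion mostly amounts to splitting the URL into a base address plus a params dict and splitting the Cookie header into a cookies dict. A rough sketch of those two steps with the standard library follows; the function names split_url and cookie_header_to_dict are hypothetical, and this is not the converter's actual implementation.

```python
from urllib.parse import parse_qsl, urlsplit


def split_url(url: str):
    """Return (base URL without query string, query parameters as a dict)."""
    parts = urlsplit(url)
    base = f"{parts.scheme}://{parts.netloc}{parts.path}"
    params = dict(parse_qsl(parts.query))
    return base, params


def cookie_header_to_dict(header_value: str) -> dict:
    """Turn 'a=1; b=2' into {'a': '1', 'b': '2'}."""
    cookies = {}
    for pair in header_value.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        # Split on the first '=' only, since cookie values may contain '='.
        name, _, value = pair.partition("=")
        cookies[name] = value
    return cookies


base, params = split_url(
    "https://xueqiu.com/hq/detail?order=desc&orderBy=percent"
    "&type=sha&market=CN&first_name=0&second_name=3"
)
cookies = cookie_header_to_dict("remember=1; is_overseas=0")

print(base)
print(params)
print(cookies)
```

The resulting base, params, and cookies values have the shape that requests.get accepts through its url, params, and cookies arguments.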



import re
import requests
import json


cookies = {
    'cookiesu': '241712922404752',
    'device_id': '6a73424ed3aae5c44aeb59b0ddfbc91b',
    'smidV2': '20240412194646e6b74b8abc2e9e66b752832ee3e0ee4800ab44152b05354c0',
    's': 'bo11shkdxf',
    'remember': '1',
    'xq_a_token': '7ef03deb28d3396dc9d555329881fd9986211657',
    'xqat': '7ef03deb28d3396dc9d555329881fd9986211657',
    'xq_id_token': 'eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJ1aWQiOjE4NzgxNDkwNzEsImlzcyI6InVjIiwiZXhwIjoxNzE1NjYzMDk3LCJjdG0iOjE3MTMxMDA5Mjc2MjYsImNpZCI6ImQ5ZDBuNEFadXAifQ.XUMYCQrcozefxhq-MVz8kB39_b5LC-wfZIEk7wytUPoTufTNNsYGnlxmoaT09V1_jadkKemvEfeDbneSTs6OaEp_aTNjMTN12xSKmUxwqqfpqzjWgZrOsUAwW3ArHYNrbT0llkfZR0nAh36p54Zl2ln-auokNRuEiqkrF-Ivpd8FPxs_b5SVXhbIM1mRgdTGyjWwCiHE9TNa7AzG870_fwimq0HefT88pjEvZSJ2tGdYAgWTYK6rmY_4nrjais4IodjkcmXpP7sFM_-OYN5NanzonuMK9OhbbOBiWNusORBeXoRUSDAyUFdNwc7Vsn5iDOm08FzVp-QmUtRGo2ne-Q',
    'xq_r_token': '62e0ff828b86cfa501f1520ad6570a99838e72e5',
    'xq_is_login': '1',
    'u': '1878149071',
    'Hm_lvt_1db88642e346389874251b5a1eded6e3': '1712922406,1713336629',
    'acw_tc': '2760827617137484501643918ec1ae269b915224c8ceb33aa4f000172e82d2',
    'is_overseas': '0',
    'Hm_lpvt_1db88642e346389874251b5a1eded6e3': '1713748456',
    '.thumbcache_f24b8bbe5a5934237bbc0eda20c1b6e7': 'QJJzyMG6NouPrj6PoJFuvPJC+4F7scrY2K8/CLAPFi42vzykhTkQXza2jBNUSCvXzZGckUg7p8vUHwJaA/U2xQ%3D%3D',
}

headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    # 'Cookie': 'cookiesu=241712922404752; device_id=6a73424ed3aae5c44aeb59b0ddfbc91b; smidV2=20240412194646e6b74b8abc2e9e66b752832ee3e0ee4800ab44152b05354c0; s=bo11shkdxf; remember=1; xq_a_token=7ef03deb28d3396dc9d555329881fd9986211657; xqat=7ef03deb28d3396dc9d555329881fd9986211657; xq_id_token=eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJ1aWQiOjE4NzgxNDkwNzEsImlzcyI6InVjIiwiZXhwIjoxNzE1NjYzMDk3LCJjdG0iOjE3MTMxMDA5Mjc2MjYsImNpZCI6ImQ5ZDBuNEFadXAifQ.XUMYCQrcozefxhq-MVz8kB39_b5LC-wfZIEk7wytUPoTufTNNsYGnlxmoaT09V1_jadkKemvEfeDbneSTs6OaEp_aTNjMTN12xSKmUxwqqfpqzjWgZrOsUAwW3ArHYNrbT0llkfZR0nAh36p54Zl2ln-auokNRuEiqkrF-Ivpd8FPxs_b5SVXhbIM1mRgdTGyjWwCiHE9TNa7AzG870_fwimq0HefT88pjEvZSJ2tGdYAgWTYK6rmY_4nrjais4IodjkcmXpP7sFM_-OYN5NanzonuMK9OhbbOBiWNusORBeXoRUSDAyUFdNwc7Vsn5iDOm08FzVp-QmUtRGo2ne-Q; xq_r_token=62e0ff828b86cfa501f1520ad6570a99838e72e5; xq_is_login=1; u=1878149071; Hm_lvt_1db88642e346389874251b5a1eded6e3=1712922406,1713336629; acw_tc=2760827617137484501643918ec1ae269b915224c8ceb33aa4f000172e82d2; is_overseas=0; Hm_lpvt_1db88642e346389874251b5a1eded6e3=1713748456; .thumbcache_f24b8bbe5a5934237bbc0eda20c1b6e7=QJJzyMG6NouPrj6PoJFuvPJC+4F7scrY2K8/CLAPFi42vzykhTkQXza2jBNUSCvXzZGckUg7p8vUHwJaA/U2xQ%3D%3D',
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'same-origin',
    'Sec-Fetch-User': '?1',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'sec-ch-ua': '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Linux"',
}

params = {
    'order': 'desc',
    'orderBy': 'percent',
    'type': 'sha',
    'market': 'CN',
    'first_name': '0',
    'second_name': '3',
}

response = requests.get('https://xueqiu.com/hq/detail', params=params, cookies=cookies, headers=headers)
str1 = response.content.decode()

result = re.findall("<script id=\"initStore\">window.__INITIAL_STORE__ = (.*?)</script>", str1)
print(f"result = [{result}]")

# findall returns a list of strings; parse the first one into a dict
# (the return value of json.loads must be kept, otherwise json_result stays a str)
json_result = json.loads(result[0])
print(f"json_result = [{json_result}]")


# Walk the parsed JSON structure
print(f'json_result.isMobile        = {json_result["isMobile"]}')
print(f'json_result.originalUrl     = {json_result["originalUrl"]}')
for item in json_result["initStore"]["tableData"]:
    print(f'symbol = {item["symbol"]}, '
          f'name = {item["name"]}, '
          f'percent change = {item["percent"]}%, '
          f'current price = {item["current"]}, '
          f'change = {item["chg"]}, ')


Output: [screenshot of the printed quotes]

4. Scraping stock quotes with re.search (garbled output)

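Key point: everything between <script id="initStore">window.__INITIAL_STORE__ = and </script> is JSON. json.loads automatically converts JSON false to Python False and true to True.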
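Before the full script, a short sketch of how re.search differs from re.findall on this pattern. It runs on a made-up one-line sample_html; the escaped dots, the re.DOTALL flag, and the extract_store helper with its None check are additions for illustration, not part of the original code.

```python
import json
import re

# Same pattern as in this post, with the dots escaped and DOTALL added
# so that a JSON payload spanning several lines would still match.
PATTERN = re.compile(
    r'<script id="initStore">window\.__INITIAL_STORE__ = (.*?)</script>',
    re.DOTALL,
)

# Hypothetical, shortened stand-in for the real page.
sample_html = (
    '<script id="initStore">window.__INITIAL_STORE__ = '
    '{"isMobile": false, "initStore": {"tableData": []}}</script>'
)

all_matches = PATTERN.findall(sample_html)   # list of captured strings
first_match = PATTERN.search(sample_html)    # a Match object, or None


def extract_store(html: str) -> dict:
    """Return the parsed store, or raise if the script block is missing."""
    match = PATTERN.search(html)
    if match is None:
        raise ValueError("initStore script block not found")
    return json.loads(match.group(1))


print(all_matches)
print(first_match.group(1))
print(extract_store(sample_html))
```

re.search stops at the first match and returns None when nothing matches, so checking for None gives a clear error on a page such as the 403 response from section 2.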


import requests
import re
import json


cookies = {
    'cookiesu': '241712922404752',
    'device_id': '6a73424ed3aae5c44aeb59b0ddfbc91b',
    'smidV2': '20240412194646e6b74b8abc2e9e66b752832ee3e0ee4800ab44152b05354c0',
    's': 'bo11shkdxf',
    'remember': '1',
    'xq_a_token': '7ef03deb28d3396dc9d555329881fd9986211657',
    'xqat': '7ef03deb28d3396dc9d555329881fd9986211657',
    'xq_id_token': 'eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJ1aWQiOjE4NzgxNDkwNzEsImlzcyI6InVjIiwiZXhwIjoxNzE1NjYzMDk3LCJjdG0iOjE3MTMxMDA5Mjc2MjYsImNpZCI6ImQ5ZDBuNEFadXAifQ.XUMYCQrcozefxhq-MVz8kB39_b5LC-wfZIEk7wytUPoTufTNNsYGnlxmoaT09V1_jadkKemvEfeDbneSTs6OaEp_aTNjMTN12xSKmUxwqqfpqzjWgZrOsUAwW3ArHYNrbT0llkfZR0nAh36p54Zl2ln-auokNRuEiqkrF-Ivpd8FPxs_b5SVXhbIM1mRgdTGyjWwCiHE9TNa7AzG870_fwimq0HefT88pjEvZSJ2tGdYAgWTYK6rmY_4nrjais4IodjkcmXpP7sFM_-OYN5NanzonuMK9OhbbOBiWNusORBeXoRUSDAyUFdNwc7Vsn5iDOm08FzVp-QmUtRGo2ne-Q',
    'xq_r_token': '62e0ff828b86cfa501f1520ad6570a99838e72e5',
    'xq_is_login': '1',
    'u': '1878149071',
    'Hm_lvt_1db88642e346389874251b5a1eded6e3': '1712922406,1713336629',
    'acw_tc': '2760827617137484501643918ec1ae269b915224c8ceb33aa4f000172e82d2',
    'is_overseas': '0',
    'Hm_lpvt_1db88642e346389874251b5a1eded6e3': '1713748456',
    '.thumbcache_f24b8bbe5a5934237bbc0eda20c1b6e7': 'QJJzyMG6NouPrj6PoJFuvPJC+4F7scrY2K8/CLAPFi42vzykhTkQXza2jBNUSCvXzZGckUg7p8vUHwJaA/U2xQ%3D%3D',
}

headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    # 'Cookie': 'cookiesu=241712922404752; device_id=6a73424ed3aae5c44aeb59b0ddfbc91b; smidV2=20240412194646e6b74b8abc2e9e66b752832ee3e0ee4800ab44152b05354c0; s=bo11shkdxf; remember=1; xq_a_token=7ef03deb28d3396dc9d555329881fd9986211657; xqat=7ef03deb28d3396dc9d555329881fd9986211657; xq_id_token=eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJ1aWQiOjE4NzgxNDkwNzEsImlzcyI6InVjIiwiZXhwIjoxNzE1NjYzMDk3LCJjdG0iOjE3MTMxMDA5Mjc2MjYsImNpZCI6ImQ5ZDBuNEFadXAifQ.XUMYCQrcozefxhq-MVz8kB39_b5LC-wfZIEk7wytUPoTufTNNsYGnlxmoaT09V1_jadkKemvEfeDbneSTs6OaEp_aTNjMTN12xSKmUxwqqfpqzjWgZrOsUAwW3ArHYNrbT0llkfZR0nAh36p54Zl2ln-auokNRuEiqkrF-Ivpd8FPxs_b5SVXhbIM1mRgdTGyjWwCiHE9TNa7AzG870_fwimq0HefT88pjEvZSJ2tGdYAgWTYK6rmY_4nrjais4IodjkcmXpP7sFM_-OYN5NanzonuMK9OhbbOBiWNusORBeXoRUSDAyUFdNwc7Vsn5iDOm08FzVp-QmUtRGo2ne-Q; xq_r_token=62e0ff828b86cfa501f1520ad6570a99838e72e5; xq_is_login=1; u=1878149071; Hm_lvt_1db88642e346389874251b5a1eded6e3=1712922406,1713336629; acw_tc=2760827617137484501643918ec1ae269b915224c8ceb33aa4f000172e82d2; is_overseas=0; Hm_lpvt_1db88642e346389874251b5a1eded6e3=1713748456; .thumbcache_f24b8bbe5a5934237bbc0eda20c1b6e7=QJJzyMG6NouPrj6PoJFuvPJC+4F7scrY2K8/CLAPFi42vzykhTkQXza2jBNUSCvXzZGckUg7p8vUHwJaA/U2xQ%3D%3D',
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'same-origin',
    'Sec-Fetch-User': '?1',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'sec-ch-ua': '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Linux"',
}

params = {
    'order': 'desc',
    'orderBy': 'percent',
    'type': 'sha',
    'market': 'CN',
    'first_name': '0',
    'second_name': '3',
}

response = requests.get('https://xueqiu.com/hq/detail', params=params, cookies=cookies, headers=headers)


html_doc = response.text
data = re.search(r"<script id=\"initStore\">window.__INITIAL_STORE__ = (.*?)</script>", html_doc)

json_data = json.loads(data.group(1))
print(f"json_data = {json_data}")

# pretty print the data (ensure_ascii=False keeps Chinese names readable instead of \uXXXX escapes):
print(json.dumps(json_data, indent=4, ensure_ascii=False))

for item in json_data["initStore"]["tableData"]:
    print(f'symbol = {item["symbol"]}, '
          f'name = {item["name"]}, '
          f'percent change = {item["percent"]}%, '
          f'current price = {item["current"]}, '
          f'change = {item["chg"]}, ')

Output: [screenshot of the printed quotes; the name field is garbled]

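The name field shows garbled characters. response.text decodes the body with the encoding that requests infers from the response headers, which appears to be wrong for this page (section 3 avoided the problem by decoding response.content as UTF-8). Setting response.encoding = 'utf-8' before reading response.text fixes it.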
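The effect can be reproduced without any network access. The sketch below assumes the wrong guess was ISO-8859-1, which is the fallback requests uses for text responses that declare no charset; whether that is exactly what happened for this page is an assumption. The stock name is just a sample string.

```python
# Sample name; any non-ASCII text behaves the same way.
name = "贵州茅台"

raw_bytes = name.encode("utf-8")             # what the server actually sends

garbled = raw_bytes.decode("iso-8859-1")     # decoding with the wrong charset
fixed = raw_bytes.decode("utf-8")            # decoding with the right charset

# The garbling is reversible as long as no bytes were lost.
recovered = garbled.encode("iso-8859-1").decode("utf-8")

print(repr(garbled))
print(fixed)
print(recovered)
```

Each Chinese character occupies three bytes in UTF-8, so the wrongly decoded string has three unrelated characters per original character.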

5. Scraping stock quotes with re.search (working)

The code after adding response.encoding = 'utf-8':


import requests
import re
import json


cookies = {
    'cookiesu': '241712922404752',
    'device_id': '6a73424ed3aae5c44aeb59b0ddfbc91b',
    'smidV2': '20240412194646e6b74b8abc2e9e66b752832ee3e0ee4800ab44152b05354c0',
    's': 'bo11shkdxf',
    'remember': '1',
    'xq_a_token': '7ef03deb28d3396dc9d555329881fd9986211657',
    'xqat': '7ef03deb28d3396dc9d555329881fd9986211657',
    'xq_id_token': 'eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJ1aWQiOjE4NzgxNDkwNzEsImlzcyI6InVjIiwiZXhwIjoxNzE1NjYzMDk3LCJjdG0iOjE3MTMxMDA5Mjc2MjYsImNpZCI6ImQ5ZDBuNEFadXAifQ.XUMYCQrcozefxhq-MVz8kB39_b5LC-wfZIEk7wytUPoTufTNNsYGnlxmoaT09V1_jadkKemvEfeDbneSTs6OaEp_aTNjMTN12xSKmUxwqqfpqzjWgZrOsUAwW3ArHYNrbT0llkfZR0nAh36p54Zl2ln-auokNRuEiqkrF-Ivpd8FPxs_b5SVXhbIM1mRgdTGyjWwCiHE9TNa7AzG870_fwimq0HefT88pjEvZSJ2tGdYAgWTYK6rmY_4nrjais4IodjkcmXpP7sFM_-OYN5NanzonuMK9OhbbOBiWNusORBeXoRUSDAyUFdNwc7Vsn5iDOm08FzVp-QmUtRGo2ne-Q',
    'xq_r_token': '62e0ff828b86cfa501f1520ad6570a99838e72e5',
    'xq_is_login': '1',
    'u': '1878149071',
    'Hm_lvt_1db88642e346389874251b5a1eded6e3': '1712922406,1713336629',
    'acw_tc': '2760827617137484501643918ec1ae269b915224c8ceb33aa4f000172e82d2',
    'is_overseas': '0',
    'Hm_lpvt_1db88642e346389874251b5a1eded6e3': '1713748456',
    '.thumbcache_f24b8bbe5a5934237bbc0eda20c1b6e7': 'QJJzyMG6NouPrj6PoJFuvPJC+4F7scrY2K8/CLAPFi42vzykhTkQXza2jBNUSCvXzZGckUg7p8vUHwJaA/U2xQ%3D%3D',
}

headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    # 'Cookie': 'cookiesu=241712922404752; device_id=6a73424ed3aae5c44aeb59b0ddfbc91b; smidV2=20240412194646e6b74b8abc2e9e66b752832ee3e0ee4800ab44152b05354c0; s=bo11shkdxf; remember=1; xq_a_token=7ef03deb28d3396dc9d555329881fd9986211657; xqat=7ef03deb28d3396dc9d555329881fd9986211657; xq_id_token=eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJ1aWQiOjE4NzgxNDkwNzEsImlzcyI6InVjIiwiZXhwIjoxNzE1NjYzMDk3LCJjdG0iOjE3MTMxMDA5Mjc2MjYsImNpZCI6ImQ5ZDBuNEFadXAifQ.XUMYCQrcozefxhq-MVz8kB39_b5LC-wfZIEk7wytUPoTufTNNsYGnlxmoaT09V1_jadkKemvEfeDbneSTs6OaEp_aTNjMTN12xSKmUxwqqfpqzjWgZrOsUAwW3ArHYNrbT0llkfZR0nAh36p54Zl2ln-auokNRuEiqkrF-Ivpd8FPxs_b5SVXhbIM1mRgdTGyjWwCiHE9TNa7AzG870_fwimq0HefT88pjEvZSJ2tGdYAgWTYK6rmY_4nrjais4IodjkcmXpP7sFM_-OYN5NanzonuMK9OhbbOBiWNusORBeXoRUSDAyUFdNwc7Vsn5iDOm08FzVp-QmUtRGo2ne-Q; xq_r_token=62e0ff828b86cfa501f1520ad6570a99838e72e5; xq_is_login=1; u=1878149071; Hm_lvt_1db88642e346389874251b5a1eded6e3=1712922406,1713336629; acw_tc=2760827617137484501643918ec1ae269b915224c8ceb33aa4f000172e82d2; is_overseas=0; Hm_lpvt_1db88642e346389874251b5a1eded6e3=1713748456; .thumbcache_f24b8bbe5a5934237bbc0eda20c1b6e7=QJJzyMG6NouPrj6PoJFuvPJC+4F7scrY2K8/CLAPFi42vzykhTkQXza2jBNUSCvXzZGckUg7p8vUHwJaA/U2xQ%3D%3D',
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'same-origin',
    'Sec-Fetch-User': '?1',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'sec-ch-ua': '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Linux"',
}

params = {
    'order': 'desc',
    'orderBy': 'percent',
    'type': 'sha',
    'market': 'CN',
    'first_name': '0',
    'second_name': '3',
}

response = requests.get('https://xueqiu.com/hq/detail', params=params, cookies=cookies, headers=headers)
response.encoding = 'utf-8'

html_doc = response.text
data = re.search(r"<script id=\"initStore\">window.__INITIAL_STORE__ = (.*?)</script>", html_doc)

json_data = json.loads(data.group(1))
print(f"json_data = {json_data}")

# pretty print the data (ensure_ascii=False keeps Chinese names readable instead of \uXXXX escapes):
print(json.dumps(json_data, indent=4, ensure_ascii=False))

for item in json_data["initStore"]["tableData"]:
    print(f'symbol = {item["symbol"]}, '
          f'name = {item["name"]}, '
          f'percent change = {item["percent"]}%, '
          f'current price = {item["current"]}, '
          f'change = {item["chg"]}, ')

Output: [screenshot of the printed quotes]
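One detail of the pretty-print step is worth a short sketch: json.dumps escapes non-ASCII characters by default, so Chinese stock names would print as \uXXXX sequences even when the response was decoded correctly. The rows list below is invented sample data using the same field names as the loop above.

```python
import json

# Hypothetical row in the shape of initStore.tableData entries.
rows = [{"symbol": "SH600000", "name": "浦发银行", "percent": 1.23}]

escaped = json.dumps(rows, indent=4)                        # default: ASCII only
readable = json.dumps(rows, indent=4, ensure_ascii=False)   # keep characters as-is

print(escaped)
print(readable)
```

Both strings parse back to the same data; only the printed form differs.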
