{"id":204972,"date":"2025-05-29T10:11:13","date_gmt":"2025-05-29T02:11:13","guid":{"rendered":"https:\/\/server.hk\/cnblog\/204972\/"},"modified":"2025-05-29T10:11:13","modified_gmt":"2025-05-29T02:11:13","slug":"beautifulsoup%e6%8f%90%e5%8f%96%e5%b8%a6%e5%9b%9e%e8%bd%a6%e7%ac%a6%e7%9a%84%e5%88%97%e8%a1%a8%e5%85%83%e7%b4%a0%e5%a6%82%e4%bd%95%e6%ad%a3%e7%a1%ae%e5%a4%84%e7%90%86%ef%bc%9f","status":"publish","type":"post","link":"https:\/\/server.hk\/cnblog\/204972\/","title":{"rendered":"How Do You Correctly Handle List Elements with Line Breaks Extracted by BeautifulSoup?"},"content":{"rendered":"<h1>How Do You Correctly Handle List Elements with Line Breaks Extracted by BeautifulSoup?<\/h1>\n<p>Hello everyone! This article looks at how to handle line breaks in list elements extracted with BeautifulSoup. 
Let's take a look.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.17golang.com\/uploads\/20241120\/1732097553673db6110fe5e.jpg\" class=\"aligncenter\"><\/p>\n<p><strong>Extracting list elements that contain line breaks with bs4.BeautifulSoup<\/strong><\/p>\n<p>When extracting page elements with the Beautiful Soup library, you may find that the text of an extracted element contains a newline character. When printed, such an element is broken across lines and looks as if it had been split into several elements.<\/p>\n<p>Code example:<\/p>\n<pre>import requests\nfrom bs4 import BeautifulSoup\n\nurl = 'http:\/\/www.pythonscraping.com\/pages\/warandpeace.html'\nhtml = requests.get(url).text\nbs = BeautifulSoup(html, 'html.parser')\n\n# Find every span element whose class is 'green'\nname_list = bs.find_all('span', {'class': 'green'})\nfor name in name_list:\n    print(name.get_text())<\/pre>\n<p>Running this code may cause Anna Pavlovna Scherer to appear as two separate pieces, &#8220;Anna Pavlovna&#8221; and &#8220;Scherer&#8221;. It is still a single element: the HTML source contains a newline inside the name, and that newline is printed along with the text.<\/p>\n<p>The get_text() method returns the content of an element as text, but it preserves newline characters, so it does not solve the problem on its own.<\/p>\n<p><strong>Solution:<\/strong><\/p>\n<p>Use the replace() method to replace each newline with a space. Replacing it with an empty string would remove the line break but also run the words on either side of it together.<\/p>\n<pre>for name in name_list:\n    # Replace each newline with a space so the words stay separated\n    print(name.get_text().replace('\\n', ' '))<\/pre>\n<p>This way, Anna Pavlovna Scherer is printed on one line as a single element.<\/p>\n<p>That concludes this article on handling list elements with line breaks extracted by BeautifulSoup. I hope it helps!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>BeautifulSoup\u63d0\u53d6\u5e26&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4925],"tags":[],"class_list":["post-204972","post","type-post","status-publish","format-standard","hentry","category-4925"],"_links":{"self":[{"href":"https:\/\/server.hk\/cnblog\/wp-json\/wp\/v2\/posts\/204972","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/server.hk\/cnblog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/server.hk\/cnblog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/server.hk\/cnblog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/server.hk\/cnblog\/wp-json\/wp\/v2\/comments?post=204972"}],"version-history":[{"count":0,"href":"https:\/\/server.hk\/cnblog\/wp-json\/wp\/v2\/posts\/204972\/revisions"}],"wp:attachment":[{"href":"https:\/\/server.hk\/cnblog\/wp-json\/wp\/v2\/media?parent=204972"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/server.hk\/cnblog\/wp-json\/wp\/v2\/categories?post=204972"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/server.hk\/cnblog\/wp-json\/wp\/v2\/tags?post=204972"}],"curies":[{"name":"
wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}