Category Archives: Beautiful Soup

Python Beautiful Soup 4

With Requests and Beautiful Soup 4 you can fetch a page and parse its HTML like this:

$ python
Python 2.7.5 (default, Mar  9 2014, 22:15:05) 
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> from bs4 import BeautifulSoup
>>> res = requests.get('http://gihyo.jp/')
>>> soup = BeautifulSoup(res.text, 'lxml')
>>> print(soup)
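
Once the page is parsed, the usual next step is to pull specific elements out of the tree. The following is only a minimal sketch of that step. The inline HTML string and its contents are made up for illustration so the example runs without network access, and it uses Python's built-in 'html.parser' rather than 'lxml' so no extra parser has to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (not taken from gihyo.jp), used so the
# example does not depend on a live site.
html = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <h1 id="top">Hello</h1>
    <ul>
      <li><a href="/a">First</a></li>
      <li><a href="/b">Second</a></li>
    </ul>
  </body>
</html>
"""

# Assumption: 'html.parser' is swapped in for 'lxml' here; it ships with
# Python, while 'lxml' must be installed separately.
soup = BeautifulSoup(html, "html.parser")

# Text of the <title> element.
title = soup.title.string

# Look up a single element by tag name and id attribute.
heading = soup.find("h1", id="top").get_text()

# Collect the link text and href of every <a> element.
links = [(a.get_text(), a["href"]) for a in soup.find_all("a")]

print(title)
print(heading)
print(links)
```

The same calls (soup.title, find, find_all) should work on the soup object built from res.text in the session above, although what they return depends on the markup of the page you fetch.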