Beautiful Soup is a self-contained parser that makes screen-scraping easy. It parses both well-formed and malformed HTML and XML, and offers methods for traversing the parse tree and extracting specific parts of a document.
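As a rough illustration of the description above, the sketch below parses deliberately sloppy markup, walks the tree, and pulls out a few pieces. It assumes the current `bs4` package and Python's built-in `html.parser` backend, which may differ from the releases listed on this page; the sample markup is made up for the example.

```python
from bs4 import BeautifulSoup

# Made-up, deliberately sloppy markup: the <li> and <p> tags are never closed.
html = (
    "<html><body><h1>Menu</h1>"
    "<ul><li><a href='/soup'>Soup</a><li><a href='/salad'>Salad</a></ul>"
    "<p>Open daily"
)

# Assumption: the modern bs4 API with the standard-library parser backend.
soup = BeautifulSoup(html, "html.parser")

# Extract a specific part of the document by tag name.
heading = soup.h1.get_text()

# Traverse upward from a node to its parent.
heading_parent = soup.h1.parent.name

# Collect every link's text and target, in document order.
links = [(a.get_text(), a["href"]) for a in soup.find_all("a")]

print(heading, heading_parent, links)
```

The parser still yields a usable tree even though several tags are unclosed, which is the main appeal for screen-scraping.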
|Tags||Text Processing Markup XML HTML/XHTML|
|Operating Systems||OS Independent|
Release Notes: Beautiful Soup can now convert invalid HTML or XML into something approaching XHTML or valid XML.
Release Notes: This release escapes all special XML characters contained in attribute values. 2.x method names have been reintroduced for backwards compatibility. There are other minor bugfixes.
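The two behaviours mentioned in these notes (repairing invalid markup and escaping special characters in attribute values) can be sketched as follows. This uses the current `bs4` package with the `html.parser` backend as an assumption; the exact output of the releases described here may have differed.

```python
from bs4 import BeautifulSoup

# Invalid input: neither <p> nor <b> is closed.
broken = BeautifulSoup("<p>one<b>two", "html.parser")
repaired = str(broken)  # serializing the tree closes the open tags

# An attribute value containing a bare ampersand.
link = BeautifulSoup('<a href="?a=1&b=2">x</a>', "html.parser")
escaped = str(link)     # the ampersand is written out as an entity
raw_value = link.a["href"]  # the in-memory value stays unescaped

print(repaired)
print(escaped)
print(raw_value)
```

Escaping happens only on output, so code that reads `link.a["href"]` still sees the original query string.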
Release Notes: Beautiful Soup now autodetects document encodings and converts them to Unicode. Methods have been added for manipulating the parse tree. You can now parse only part of a document, saving time. The API has been cleaned up.
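A minimal sketch of the features in this note, written against the current `bs4` package (an assumption; names such as `SoupStrainer`, `parse_only`, and `original_encoding` are from today's API and may not match the release described). The byte string and its declared charset are invented for the example.

```python
from bs4 import BeautifulSoup, SoupStrainer

# Made-up document, encoded as ISO-8859-1 bytes with a matching declaration.
raw = (
    '<html><head><meta charset="iso-8859-1"></head><body>'
    "<p>caf\u00e9</p>"
    "<a href='/menu'>Menu</a><a href='/hours'>Hours</a>"
    "</body></html>"
).encode("latin-1")

# Encoding detection: bytes in, Unicode text out.
full = BeautifulSoup(raw, "html.parser")
paragraph = full.p.get_text()
detected = full.original_encoding  # whatever encoding the detector settled on

# Tree manipulation: rename a tag and change an attribute in place.
full.p.name = "span"
full.a["href"] = "/menu.html"
first_link = str(full.a)

# Partial parsing: build a tree from the <a> tags only.
links_only = BeautifulSoup(raw, "html.parser", parse_only=SoupStrainer("a"))
hrefs = [a["href"] for a in links_only.find_all("a")]
skipped_paragraph = links_only.find("p")

print(paragraph, detected, first_link, hrefs, skipped_paragraph)
```

Restricting the parse to the tags of interest avoids building nodes for the rest of the document, which is where the time saving comes from.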
Release Notes: This release fixes several parsing bugs and a serious performance problem.
Release Notes: Several new ways to search a parse tree were added. Some minor bugs were fixed. Search performance was improved.
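The note does not say which search methods were added, so the sketch below only illustrates the kinds of searches the current `bs4` package supports (an assumption, not a record of this release): by tag name, by CSS class, by regular expression on an attribute, and by an arbitrary predicate. The markup is invented for the example.

```python
import re

from bs4 import BeautifulSoup

# Made-up markup for the example.
html = """
<div id="menu">
  <a class="item" href="/soup">Soup</a>
  <a class="item special" href="/salad">Salad</a>
  <b>Closed Mondays</b>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Match any of several tag names.
by_name = [tag.name for tag in soup.find_all(["a", "b"])]

# Match on one of a tag's CSS classes.
by_class = [a.get_text() for a in soup.find_all("a", class_="special")]

# Match an attribute value against a regular expression.
by_regex = [a["href"] for a in soup.find_all("a", href=re.compile(r"^/sa"))]

# Match with an arbitrary function applied to each tag.
by_function = [tag.name for tag in soup.find_all(lambda tag: tag.has_attr("id"))]

print(by_name, by_class, by_regex, by_function)
```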