Pyppeteer

Headless chrome/chromium automation library (unofficial port of puppeteer)
Alternatives To Pyppeteer
Project NameStarsDownloadsRepos Using ThisPackages Using ThisMost Recent CommitTotal ReleasesLatest ReleaseOpen IssuesLicenseLanguage
Puppeteer84,65512,12814,02610 hours ago828July 20, 2023279apache-2.0TypeScript
Node.js API for Chrome
Gotenberg5,595
7 days ago35July 20, 202358mitGo
A developer-friendly API for converting numerous document formats into PDF files, and more!
Headless Chrome Crawler5,05110122 years ago21June 11, 201828mitJavaScript
Distributed crawler powered by Headless Chrome
Browsershot4,44498273 days ago132July 27, 20234mitPHP
Convert HTML to an image, PDF or string
Flaresolverr3,613
2 days ago36mitPython
Proxy server to bypass Cloudflare protection
Browser Fingerprinting3,353
7 months ago7JavaScript
Analysis of Bot Protection systems with available countermeasures 🚿. How to defeat anti-bot system 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?
Pyppeteer3,240
3 years ago154otherPython
Headless chrome/chromium automation library (unofficial port of puppeteer)
Chrome Aws Lambda3,01166894 months ago66July 17, 202166mitTypeScript
Chromium Binary for AWS Lambda and Google Cloud Functions
Puppeteer Sharp2,845271103 days ago68August 16, 2023121mitC#
Headless Chrome .NET API
Pyppeteer2,8331851778 months ago31January 11, 2022173otherPython
Headless chrome/chromium automation library (unofficial port of puppeteer)
Alternatives To Pyppeteer
Select To Compare


Alternative Project Comparisons
Readme

Pyppeteer

PyPI PyPI version Documentation Travis status AppVeyor status codecov

Unofficial Python port of puppeteer JavaScript (headless) chrome/chromium browser automation library.

Installation

Pyppeteer requires python 3.6+. (experimentally supports python 3.5)

Install by pip from PyPI:

python3 -m pip install pyppeteer

Or install latest version from github:

python3 -m pip install -U git+https://github.com/miyakogi/pyppeteer.git@dev

Usage

Note: When you run pyppeteer first time, it downloads a recent version of Chromium (~100MB). If you don't prefer this behavior, run pyppeteer-install command before running scripts which uses pyppeteer.

Example: open web page and take a screenshot.

import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('http://example.com')
    await page.screenshot({'path': 'example.png'})
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())

Example: evaluate script on the page.

import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('http://example.com')
    await page.screenshot({'path': 'example.png'})

    dimensions = await page.evaluate('''() => {
        return {
            width: document.documentElement.clientWidth,
            height: document.documentElement.clientHeight,
            deviceScaleFactor: window.devicePixelRatio,
        }
    }''')

    print(dimensions)
    # >>> {'width': 800, 'height': 600, 'deviceScaleFactor': 1}
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())

Pyppeteer has almost same API as puppeteer. More APIs are listed in the document.

Puppeteer's document and troubleshooting are also useful for pyppeteer users.

Differences between puppeteer and pyppeteer

Pyppeteer is to be as similar as puppeteer, but some differences between python and JavaScript make it difficult.

These are differences between puppeteer and pyppeteer.

Keyword arguments for options

Puppeteer uses object (dictionary in python) for passing options to functions/methods. Pyppeteer accepts both dictionary and keyword arguments for options.

Dictionary style option (similar to puppeteer):

browser = await launch({'headless': True})

Keyword argument style option (more pythonic, isn't it?):

browser = await launch(headless=True)

Element selector method name ($ -> querySelector)

In python, $ is not usable for method name. So pyppeteer uses Page.querySelector()/Page.querySelectorAll()/Page.xpath() instead of Page.$()/Page.$$()/Page.$x(). Pyppeteer also has shorthands for these methods, Page.J(), Page.JJ(), and Page.Jx().

Arguments of Page.evaluate() and Page.querySelectorEval()

Puppeteer's version of evaluate() takes JavaScript raw function or string of JavaScript expression, but pyppeteer takes string of JavaScript. JavaScript strings can be function or expression. Pyppeteer tries to automatically detect the string is function or expression, but sometimes it fails. If expression string is treated as function and error is raised, add force_expr=True option, which force pyppeteer to treat the string as expression.

Example to get page content:

content = await page.evaluate('document.body.textContent', force_expr=True)

Example to get element's inner text:

element = await page.querySelector('h1')
title = await page.evaluate('(element) => element.textContent', element)

Future Plan

  1. Catch up development of puppeteer
    • Not intend to add original API which puppeteer does not have

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

Popular Puppeteer Projects
Popular Chromium Projects
Popular Web Browsers Categories
Related Searches

Get A Weekly Email With Trending Projects For These Categories
No Spam. Unsubscribe easily at any time.
Javascript
Python
Chromium
Puppeteer