PhearJS

Render dynamic JavaScript webpages to JSON

Fancy JS

PhearJS renders dynamic webpages using PhantomJS: fetch a page, render it and return a pretty, machine-readable JSON object.

Many websites rely on JavaScript to load data via AJAX and to render the front end. When a client that does not execute JavaScript (e.g. curl) requests such a page, it only 'sees' an empty page.

This is a problem when you want a static copy of a dynamic page, e.g. for SEO purposes, web scraping or data mining. PhearJS fixes this by rendering pages in a headless PhantomJS browser and returning a fancy JSON object containing the rendered page and metadata about the response.

Becomes fancy JSON

{
    "success": true,
    "input_url": "http://such-website.com",
    "final_url": "http://www.such-website.com/",
    "had_js_errors": false,
    "content": "<html>rendered</html>",
    "request_headers": {},
    "response_headers": {
        "date": "Sun, 08 Feb 2015 15:11:22 GMT",
        "content-encoding": "gzip",
        "cache-control": "max-age=60",
        "content-type": "text/html; charset=utf-8"
    }
}
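
As a rough illustration of how a client might consume such an object, here is a minimal Python sketch. It only parses a copy of the sample above (headers trimmed) and does not contact a PhearJS server. The helper name `summarize` is hypothetical, and treating a difference between `input_url` and `final_url` as a redirect is an assumption, not documented PhearJS behaviour.

```python
import json

# Sample response copied from the example above (headers trimmed for brevity).
SAMPLE = """
{
    "success": true,
    "input_url": "http://such-website.com",
    "final_url": "http://www.such-website.com/",
    "had_js_errors": false,
    "content": "<html>rendered</html>",
    "request_headers": {},
    "response_headers": {
        "content-type": "text/html; charset=utf-8"
    }
}
"""


def summarize(raw: str) -> dict:
    """Parse a PhearJS-style JSON response and pull out a few facts.

    Hypothetical helper for illustration; it is not part of PhearJS.
    """
    data = json.loads(raw)
    headers = data.get("response_headers", {})
    return {
        # Usable only if the fetch succeeded and no JS errors were reported.
        "ok": bool(data.get("success")) and not data.get("had_js_errors", False),
        # Assumption: differing URLs indicate that a redirect took place.
        "redirected": data.get("input_url") != data.get("final_url"),
        "content_type": headers.get("content-type"),
        "html": data.get("content", ""),
    }


summary = summarize(SAMPLE)
print(summary["redirected"], summary["content_type"])
```

A caller would typically check `ok` before handing `html` to a scraper or indexer.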
      

More power

Exception handling

The internet is a messy place and exceptions aren't rare. PhearJS handles them for you, timeouts included.

Metadata

PhearJS returns metadata about the request. Were there any redirects or errors, and which headers were sent and received?

Caching

Requests are cached using Memcached. This allows you to return results quickly by pre-rendering. Forced cache busts are possible.
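
The general idea of serving cached renders with an optional forced bust can be sketched as follows. This is not PhearJS code: an in-memory dictionary stands in for Memcached, and the names `RenderCache`, `fetch_rendered`, `fake_render` and the `force` flag are all hypothetical.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class RenderCache:
    """Tiny in-memory stand-in for Memcached with per-entry expiry."""

    def __init__(self, ttl_seconds: float = 60.0) -> None:
        self.ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> Optional[str]:
        entry = self._store.get(url)
        if entry is None:
            return None
        expires_at, html = entry
        if time.monotonic() >= expires_at:
            del self._store[url]  # expired: drop the stale entry
            return None
        return html

    def set(self, url: str, html: str) -> None:
        self._store[url] = (time.monotonic() + self.ttl, html)


def fetch_rendered(url: str, render: Callable[[str], str],
                   cache: RenderCache, force: bool = False) -> str:
    """Return a cached render when available; force=True busts the cache."""
    if not force:
        cached = cache.get(url)
        if cached is not None:
            return cached
    html = render(url)  # the expensive step (a headless browser in practice)
    cache.set(url, html)
    return html


calls: List[str] = []


def fake_render(url: str) -> str:
    """Stand-in renderer that records how often it is invoked."""
    calls.append(url)
    return f"<html>{url} v{len(calls)}</html>"


cache = RenderCache(ttl_seconds=60)
first = fetch_rendered("http://such-website.com", fake_render, cache)
second = fetch_rendered("http://such-website.com", fake_render, cache)  # cache hit
third = fetch_rendered("http://such-website.com", fake_render, cache, force=True)
```

In this sketch the second call never reaches the renderer, while the forced call re-renders and overwrites the cached entry.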

Configuration

PhearJS is configurable. You can create a whitelist of allowed client IPs, set the number of workers to spawn, set the delay before parsing and more.
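
To make the idea of an IP whitelist concrete, here is a minimal Python sketch of such a check. The list `ALLOWED_CLIENTS`, the function `is_allowed` and the support for CIDR ranges are assumptions made for illustration; the actual PhearJS configuration format is not shown here.

```python
import ipaddress
from typing import Sequence

# Hypothetical whitelist: single addresses and CIDR ranges.
ALLOWED_CLIENTS = ["127.0.0.1", "10.0.0.0/8"]


def is_allowed(client_ip: str, allowed: Sequence[str] = ALLOWED_CLIENTS) -> bool:
    """Return True if client_ip matches any whitelist entry."""
    try:
        ip = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # malformed addresses are rejected outright
    # A bare address such as "127.0.0.1" is treated as a /32 network.
    return any(ip in ipaddress.ip_network(entry, strict=False) for entry in allowed)


print(is_allowed("10.1.2.3"), is_allowed("8.8.8.8"))
```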