Microlink API:
Browser automation

January 26, 2020

A web browser is one of the most complex pieces of software in everyday use: many internal subsystems work together to resolve any kind of URL on the Internet, even if the content was written with HTML tables in 1992.
Microlink API is a service that provides a high-level API for controlling a browser instance in the simplest way possible, where individual features can be enabled or disabled using query parameters.
When we started the service, just a few things could be done. Now we support more than 30 query parameters.
Only url is required; any of the following query parameters can be added alongside it:

Data

Enrich the response payload with data detected from the target URL.
  • audio: enables audio source detection from the target URL.
  • data: gets specific content extraction from the target URL.
  • filename: defines the filename asset generated.
  • function: runs JavaScript code with runtime access to a headless browser.
  • iframe: gets, if it's possible, the embedded representation of the target URL.
  • insights: gets Lighthouse performance metrics from the target URL.
  • meta: gets unified metadata from the target URL.
  • palette: gets color information for any image present in the response data.
  • pdf: generates a PDF of the target URL.
  • screenshot: takes a screenshot of the target URL.
  • video: enables video source detection from the target URL.
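
As a rough illustration of how these parameters combine with url, here is a minimal sketch that only builds the request URL and sends nothing. The endpoint https://api.microlink.io/ and the helper name build_request_url are assumptions made for this example; they are not stated in this post.

```python
from urllib.parse import urlencode

# Assumption: the public endpoint; it is not stated in this post.
API_ENDPOINT = "https://api.microlink.io/"


def build_request_url(target_url, **options):
    """Return an API URL with `url` first, followed by the optional parameters."""
    params = {"url": target_url}
    for name, value in options.items():
        # Booleans are written as lowercase strings, the usual query-string form.
        params[name] = str(value).lower() if isinstance(value, bool) else value
    return API_ENDPOINT + "?" + urlencode(params)


# Only `url` is mandatory; `screenshot` and `palette` are opt-in features.
print(build_request_url("https://example.com", screenshot=True, palette=True))
# https://api.microlink.io/?url=https%3A%2F%2Fexample.com&screenshot=true&palette=true
```

Keeping url as the first key makes the generated links easier to read in logs, although the order of query parameters should not matter to the service.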

Browser

Tell the browser to act in a certain way or perform some tasks.
  • adblock: enable/disable blocking of abusive third-party content on the browser page.
  • animations: enable/disable CSS animations and transitions on the browser page.
  • click: clicks DOM elements matching the given CSS selectors.
  • codeScheme: sets the code syntax highlighting color theme to use.
  • colorScheme: sets preferred browser color theme preference.
  • device: emulates a specific device (viewport, user agent, dimensions, etc.).
  • javascript: enable/disable the JavaScript engine on the entire browser page.
  • mediaType: changes the CSS media type of the page.
  • modules: injects <script type="module"> into the browser page.
  • ping: enable/disable resolving all URLs present in the payload.
  • prerender: enable/disable browser navigation.
  • proxy: uses a proxy server as an intermediary during the requests.
  • retry: sets the number of exponential backoff retries to perform after an unexpected browser error.
  • scripts: injects <script> into the browser page.
  • scroll: scrolls to the DOM element matching the given CSS selector.
  • styles: injects <style> into the browser page.
  • viewport: establishes a set of properties related to the browser's visible area.
  • waitForSelector: waits for one or more CSS selectors to appear in the page.
  • waitForTimeout: waits a given number of milliseconds before processing the content of the browser page.
  • waitUntil: waits for one or more browser events before considering navigation succeeded.
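
Some of these options take several values (click accepts multiple selectors) or a group of properties (viewport). This post does not say how such values are serialized, so the sketch below is only one plausible encoding: dotted keys for nested properties and repeated keys for lists. The helper flatten_options and the width/height property names are hypothetical.

```python
from urllib.parse import urlencode


def flatten_options(options, prefix=""):
    """Flatten nested dicts into dotted keys; lists become repeated keys."""
    pairs = []
    for key, value in options.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Nested properties: viewport -> viewport.width, viewport.height
            pairs.extend(flatten_options(value, name))
        elif isinstance(value, (list, tuple)):
            # Several values for one parameter: click=...&click=...
            pairs.extend((name, item) for item in value)
        elif isinstance(value, bool):
            pairs.append((name, str(value).lower()))
        else:
            pairs.append((name, value))
    return pairs


browser_options = {
    "url": "https://example.com",
    "adblock": False,
    "click": ["#accept", ".close"],
    "viewport": {"width": 1280, "height": 720},  # hypothetical property names
    "waitForTimeout": 1500,  # milliseconds, as described in the list above
}

# Assumed serialization only; the real API may expect a different shape.
query = urlencode(flatten_options(browser_options))
print(query)
```

Building a list of pairs rather than a dict keeps repeated keys such as click intact when the query string is encoded.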

Response

Apply modifications to the response data to better fit your use case.
  • embed: embeds a specific response data field, respecting its content type.
  • filter: filters a list of properties from the response data to save bandwidth.
  • force: forces a fresh response, bypassing the cache layer.
  • headers: customizes requests using custom HTTP headers.
  • timeout: defines the maximum amount of time allowed for resolving a request.
  • ttl: configures the cache layer by specifying the time-to-live before a resource is refreshed.
  • staleTtl: configures the cache layer by specifying when a resource can be considered stale, refreshing it in the background.
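
To make the relationship between ttl and staleTtl more concrete, here is a small sketch of one possible reading of the two descriptions above: a response younger than staleTtl is fresh, one between staleTtl and ttl is served from cache while it is refreshed in the background, and one older than ttl is fetched again. The function cache_state, the millisecond units, and the exact boundaries are assumptions for illustration, not the service's implementation.

```python
def cache_state(age_ms, ttl_ms, stale_ttl_ms):
    """Classify a cached response by its age (all values in milliseconds)."""
    if age_ms < stale_ttl_ms:
        return "fresh"    # serve from cache, nothing else to do
    if age_ms < ttl_ms:
        return "stale"    # serve from cache, refresh in the background
    return "expired"      # time-to-live exceeded, resolve the URL again


# Hypothetical settings: ttl of one day, staleTtl of one hour.
ONE_HOUR = 60 * 60 * 1000
ONE_DAY = 24 * ONE_HOUR

for age in (10 * 60 * 1000, 5 * ONE_HOUR, 2 * ONE_DAY):
    print(age, cache_state(age, ttl_ms=ONE_DAY, stale_ttl_ms=ONE_HOUR))
```

With force enabled, this classification would be skipped entirely, since that parameter bypasses the cache layer.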

Join the community

All of these improvements and features are community driven: we listen to your feedback and act accordingly.
Whether you are building a product and need fancy previews, you’re an indie hacker, or you simply like frontend stuff, come chat with us 🙂.