Microlink API: Browser automation

January 26, 2020
A web browser is one of the most complex pieces of software, with many internal subsystems that work together to resolve any kind of URL on the Internet, even if the content was written with HTML tables in 1992.
Microlink API is a service that provides a high-level API to control a browser instance in the simplest way possible, where the different features can be enabled or disabled using query parameters.
When we started the service, only a few things could be done. Now, we support more than 30 query parameters.
Only url is required, and it can be combined with any of the following query parameters:

Data

Enrich the response payload with data detected from the target URL.
  • audio: enables audio source detection from the target URL.
  • data: gets specific content extraction from the target URL.
  • iframe: gets the embedded representation of the target URL, when one is available.
  • insights: gets Lighthouse performance metrics from the target URL.
  • meta: gets unified metadata from the target URL.
  • palette: gets color information for any image present in the response data.
  • pdf: generates a PDF of the target URL.
  • screenshot: takes a screenshot of the target URL.
  • video: enables video source detection from the target URL.
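
As a minimal sketch of how these data parameters might be combined with url, the snippet below builds a request URL with Python's standard library. The endpoint https://api.microlink.io, the helper name build_request_url, and the true/false serialization of booleans are assumptions made for illustration; the parameter names themselves come from the list above.

```python
from urllib.parse import urlencode

# Assumption: the public API endpoint. It is not stated in this article.
API_ENDPOINT = "https://api.microlink.io"


def build_request_url(target_url: str, **params: object) -> str:
    """Return an API URL carrying `url` plus any extra query parameters.

    Hypothetical helper, not part of any official client.
    """
    query = {"url": target_url}
    for name, value in params.items():
        if isinstance(value, bool):
            # Assumed format: booleans travel as lowercase strings.
            query[name] = "true" if value else "false"
        else:
            query[name] = str(value)
    # urlencode percent-encodes the target URL so it survives as one value.
    return f"{API_ENDPOINT}?{urlencode(query)}"


request_url = build_request_url("https://example.com", screenshot=True, palette=True)
print(request_url)
```

Since url is the only required parameter, build_request_url("https://example.com") with no extra arguments also yields a valid query string under the same assumptions.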

Browser

Tell the browser to act in a certain way or perform some tasks.
  • adblock: enable/disable blocking of abusive third-party content on the browser page.
  • animations: enable/disable CSS animations and transitions on the browser page.
  • click: clicks DOM elements matching the given CSS selectors.
  • codeScheme: sets the code syntax highlighting color theme to use.
  • colorScheme: sets the browser's preferred color scheme.
  • device: emulates a specific device (viewport, user agent, dimensions, etc).
  • hide: sets visibility: hidden on the matched elements.
  • javascript: enable/disable the JavaScript engine on the entire browser page.
  • mediaType: changes the CSS media type of the page.
  • modules: injects <script type="module"> into the browser page.
  • ping: enable/disable resolving all URLs present in the payload.
  • prerender: enable/disable browser navigation.
  • proxy: uses a proxy server as an intermediary during the requests.
  • remove: sets display: none on the matched elements.
  • scripts: injects <script> into the browser page.
  • scroll: scrolls to the DOM element matching the given CSS selector.
  • styles: injects <style> into the browser page.
  • viewport: establishes a set of properties related to the visible area of the browser.
  • waitFor: waits for an amount of time or for a selector before processing the content of the browser page.
  • waitUntil: waits for the given browser event(s) before considering navigation succeeded.
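
Browser parameters often carry CSS selectors, which contain characters such as # that have their own meaning in a URL. The sketch below shows one way to encode them safely and then decodes the query string to check nothing was lost. The endpoint and every parameter value here (the selectors, "dark", "false") are illustrative assumptions rather than documented formats.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: the public API endpoint. It is not stated in this article.
API_ENDPOINT = "https://api.microlink.io"

# Parameter names come from the list above; the values are illustrative.
query = {
    "url": "https://example.com",
    "screenshot": "true",
    "colorScheme": "dark",      # assumed value
    "animations": "false",      # assumed boolean format
    "click": "#accept",         # hypothetical selector to click
    "hide": ".cookie-banner",   # hypothetical selector to hide
}

# urlencode turns "#" into "%23"; left raw, it would start a URL fragment
# and everything after it would never reach the server.
request_url = f"{API_ENDPOINT}?{urlencode(query)}"

# Decode the query string again to confirm the selectors round-trip intact.
decoded = parse_qs(urlsplit(request_url).query)

print(request_url)
print(decoded["click"], decoded["hide"])
```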

Response

Modify the response data to better fit your use case.
  • embed: embeds a specific response data field, respecting its content type.
  • filter: keeps only the listed properties of the response data to save bandwidth.
  • force: forces a fresh response, bypassing the cache layer.
  • headers: customizes requests using custom HTTP headers.
  • ttl: configures the cache layer specifying the time to live.
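
With this many parameters, a typo in a name is easy to miss. The following is a small client-side sketch that groups the names listed in this post and flags anything unrecognized before a request is sent. The names PARAMETER_GROUPS, group_of, and unknown_parameters are hypothetical, the check only covers top-level names exactly as written above, and it says nothing about how the service itself validates input.

```python
from typing import Dict, List, Optional, Set

# Parameter names copied from the three lists in this post.
PARAMETER_GROUPS: Dict[str, Set[str]] = {
    "data": {
        "audio", "data", "iframe", "insights", "meta",
        "palette", "pdf", "screenshot", "video",
    },
    "browser": {
        "adblock", "animations", "click", "codeScheme", "colorScheme",
        "device", "hide", "javascript", "mediaType", "modules", "ping",
        "prerender", "proxy", "remove", "scripts", "scroll", "styles",
        "viewport", "waitFor", "waitUntil",
    },
    "response": {"embed", "filter", "force", "headers", "ttl"},
}


def group_of(name: str) -> Optional[str]:
    """Return the group a parameter belongs to, or None if it is not listed."""
    for group, names in PARAMETER_GROUPS.items():
        if name in names:
            return group
    return None


def unknown_parameters(query: Dict[str, str]) -> List[str]:
    """Return the sorted names in `query` that are neither `url` nor listed."""
    return sorted(n for n in query if n != "url" and group_of(n) is None)


typos = unknown_parameters(
    {"url": "https://example.com", "screenshot": "true", "screenshoot": "true"}
)
print(group_of("ttl"), typos)
```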

Come chat with us

All of these improvements and features are community driven: we listen to your feedback and act accordingly.
Whether you are building a product that needs fancy previews, you’re an indie hacker, or you simply like frontend stuff, come chat with us 🙂.