Skip to content

Choosing scope

Most Markdown quality issues are really scope issues. The rule decides which HTML becomes Markdown, and the page state decides whether that HTML is clean, complete, and ready to serialize.

Start with the smallest useful wrapper

Whole-page conversion is fine for a first pass:

The following examples show how to use the Microlink API with CLI, cURL, JavaScript, Python, Ruby, PHP & Golang, targeting 'https://example.com' URL with 'data' & 'meta' API parameters:

CLI Microlink API example

microlink https://example.com&data.content.attr=markdown

cURL Microlink API example

curl -G "https://api.microlink.io" \
  -d "url=https://example.com" \
  -d "data.content.attr=markdown" \
  -d "meta=false"

JavaScript Microlink API example

import mql from '@microlink/mql'

const { data } = await mql('https://example.com', {
  data: {
    content: {
      attr: "markdown"
    }
  },
  meta: false
})

Python Microlink API example

import requests

url = "https://api.microlink.io/"

querystring = {
    "url": "https://example.com",
    "data.content.attr": "markdown",
    "meta": "false"
}

response = requests.get(url, params=querystring)

print(response.json())

Ruby Microlink API example

require 'uri'
require 'net/http'

base_url = "https://api.microlink.io/"

params = {
  url: "https://example.com",
  data.content.attr: "markdown",
  meta: "false"
}

uri = URI(base_url)
uri.query = URI.encode_www_form(params)

http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true

request = Net::HTTP::Get.new(uri)
response = http.request(request)

puts response.body

PHP Microlink API example

<?php

$baseUrl = "https://api.microlink.io/";

$params = [
    "url" => "https://example.com",
    "data.content.attr" => "markdown",
    "meta" => "false"
];

$query = http_build_query($params);
$url = $baseUrl . '?' . $query;

$curl = curl_init();

curl_setopt_array($curl, [
    CURLOPT_URL => $url,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_ENCODING => "",
    CURLOPT_MAXREDIRS => 10,
    CURLOPT_TIMEOUT => 30,
    CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
    CURLOPT_CUSTOMREQUEST => "GET"
]);

$response = curl_exec($curl);
$err = curl_error($curl);

curl_close($curl);

if ($err) {
    echo "cURL Error #: " . $err;
} else {
    echo $response;
}

Golang Microlink API example

package main

import (
    "fmt"
    "net/http"
    "net/url"
    "io"
)

func main() {
    baseURL := "https://api.microlink.io"

    u, err := url.Parse(baseURL)
    if err != nil {
        panic(err)
    }
    q := u.Query()
    q.Set("url", "https://example.com")
    q.Set("data.content.attr", "markdown")
    q.Set("meta", "false")
    u.RawQuery = q.Encode()

    req, err := http.NewRequest("GET", u.String(), nil)
    if err != nil {
        panic(err)
    }

    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        panic(err)
    }

    fmt.Println(string(body))
}
Use whole-page conversion to prototype quickly or inspect what the site exposes before you tighten the scope.
For production use, a scoped wrapper is usually better:

The following examples show how to use the Microlink API with CLI, cURL, JavaScript, Python, Ruby, PHP & Golang, targeting 'https://microlink.io/docs/api/getting-started/overview' URL with 'data' & 'meta' API parameters:

CLI Microlink API example

microlink https://microlink.io/docs/api/getting-started/overview&data.article.selector=main&data.article.attr=markdown

cURL Microlink API example

curl -G "https://api.microlink.io" \
  -d "url=https://microlink.io/docs/api/getting-started/overview" \
  -d "data.article.selector=main" \
  -d "data.article.attr=markdown" \
  -d "meta=false"

JavaScript Microlink API example

import mql from '@microlink/mql'

const { data } = await mql('https://microlink.io/docs/api/getting-started/overview', {
  data: {
    article: {
      selector: "main",
      attr: "markdown"
    }
  },
  meta: false
})

Python Microlink API example

import requests

url = "https://api.microlink.io/"

querystring = {
    "url": "https://microlink.io/docs/api/getting-started/overview",
    "data.article.selector": "main",
    "data.article.attr": "markdown",
    "meta": "false"
}

response = requests.get(url, params=querystring)

print(response.json())

Ruby Microlink API example

require 'uri'
require 'net/http'

base_url = "https://api.microlink.io/"

params = {
  url: "https://microlink.io/docs/api/getting-started/overview",
  data.article.selector: "main",
  data.article.attr: "markdown",
  meta: "false"
}

uri = URI(base_url)
uri.query = URI.encode_www_form(params)

http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true

request = Net::HTTP::Get.new(uri)
response = http.request(request)

puts response.body

PHP Microlink API example

<?php

$baseUrl = "https://api.microlink.io/";

$params = [
    "url" => "https://microlink.io/docs/api/getting-started/overview",
    "data.article.selector" => "main",
    "data.article.attr" => "markdown",
    "meta" => "false"
];

$query = http_build_query($params);
$url = $baseUrl . '?' . $query;

$curl = curl_init();

curl_setopt_array($curl, [
    CURLOPT_URL => $url,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_ENCODING => "",
    CURLOPT_MAXREDIRS => 10,
    CURLOPT_TIMEOUT => 30,
    CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
    CURLOPT_CUSTOMREQUEST => "GET"
]);

$response = curl_exec($curl);
$err = curl_error($curl);

curl_close($curl);

if ($err) {
    echo "cURL Error #: " . $err;
} else {
    echo $response;
}

Golang Microlink API example

package main

import (
    "fmt"
    "net/http"
    "net/url"
    "io"
)

func main() {
    baseURL := "https://api.microlink.io"

    u, err := url.Parse(baseURL)
    if err != nil {
        panic(err)
    }
    q := u.Query()
    q.Set("url", "https://microlink.io/docs/api/getting-started/overview")
    q.Set("data.article.selector", "main")
    q.Set("data.article.attr", "markdown")
    q.Set("meta", "false")
    u.RawQuery = q.Encode()

    req, err := http.NewRequest("GET", u.String(), nil)
    if err != nil {
        panic(err)
    }

    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        panic(err)
    }

    fmt.Println(string(body))
}
Start with semantic wrappers like main or article. It is usually more reliable than cleaning the full page after the fact.
If you needUse
A quick draft of the entire pageOmit selector
Cleaner docs or article contentselector: 'main' or selector: 'article'
One specific block or panelA more precise CSS selector

Add supporting fields when markdown alone is not enough

Markdown is often only one field in a larger payload:

The following examples show how to use the Microlink API with CLI, cURL, JavaScript, Python, Ruby, PHP & Golang, targeting 'https://microlink.io/docs/api/getting-started/overview' URL with 'data' & 'meta' API parameters:

CLI Microlink API example

microlink https://microlink.io/docs/api/getting-started/overview&data.title.selector='main h1'&data.title.attr=text&data.article.selector=main&data.article.attr=markdown&data.headings.selectorAll='main h2'&data.headings.attr=text

cURL Microlink API example

curl -G "https://api.microlink.io" \
  -d "url=https://microlink.io/docs/api/getting-started/overview" \
  -d "data.title.selector=main%20h1" \
  -d "data.title.attr=text" \
  -d "data.article.selector=main" \
  -d "data.article.attr=markdown" \
  -d "data.headings.selectorAll=main%20h2" \
  -d "data.headings.attr=text" \
  -d "meta=false"

JavaScript Microlink API example

import mql from '@microlink/mql'

const { data } = await mql('https://microlink.io/docs/api/getting-started/overview', {
  data: {
    title: {
      selector: "main h1",
      attr: "text"
    },
    article: {
      selector: "main",
      attr: "markdown"
    },
    headings: {
      selectorAll: "main h2",
      attr: "text"
    }
  },
  meta: false
})

Python Microlink API example

import requests

url = "https://api.microlink.io/"

querystring = {
    "url": "https://microlink.io/docs/api/getting-started/overview",
    "data.title.selector": "main h1",
    "data.title.attr": "text",
    "data.article.selector": "main",
    "data.article.attr": "markdown",
    "data.headings.selectorAll": "main h2",
    "data.headings.attr": "text",
    "meta": "false"
}

response = requests.get(url, params=querystring)

print(response.json())

Ruby Microlink API example

require 'uri'
require 'net/http'

base_url = "https://api.microlink.io/"

params = {
  url: "https://microlink.io/docs/api/getting-started/overview",
  data.title.selector: "main h1",
  data.title.attr: "text",
  data.article.selector: "main",
  data.article.attr: "markdown",
  data.headings.selectorAll: "main h2",
  data.headings.attr: "text",
  meta: "false"
}

uri = URI(base_url)
uri.query = URI.encode_www_form(params)

http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true

request = Net::HTTP::Get.new(uri)
response = http.request(request)

puts response.body

PHP Microlink API example

<?php

$baseUrl = "https://api.microlink.io/";

$params = [
    "url" => "https://microlink.io/docs/api/getting-started/overview",
    "data.title.selector" => "main h1",
    "data.title.attr" => "text",
    "data.article.selector" => "main",
    "data.article.attr" => "markdown",
    "data.headings.selectorAll" => "main h2",
    "data.headings.attr" => "text",
    "meta" => "false"
];

$query = http_build_query($params);
$url = $baseUrl . '?' . $query;

$curl = curl_init();

curl_setopt_array($curl, [
    CURLOPT_URL => $url,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_ENCODING => "",
    CURLOPT_MAXREDIRS => 10,
    CURLOPT_TIMEOUT => 30,
    CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
    CURLOPT_CUSTOMREQUEST => "GET"
]);

$response = curl_exec($curl);
$err = curl_error($curl);

curl_close($curl);

if ($err) {
    echo "cURL Error #: " . $err;
} else {
    echo $response;
}

Golang Microlink API example

package main

import (
    "fmt"
    "net/http"
    "net/url"
    "io"
)

func main() {
    baseURL := "https://api.microlink.io"

    u, err := url.Parse(baseURL)
    if err != nil {
        panic(err)
    }
    q := u.Query()
    q.Set("url", "https://microlink.io/docs/api/getting-started/overview")
    q.Set("data.title.selector", "main h1")
    q.Set("data.title.attr", "text")
    q.Set("data.article.selector", "main")
    q.Set("data.article.attr", "markdown")
    q.Set("data.headings.selectorAll", "main h2")
    q.Set("data.headings.attr", "text")
    q.Set("meta", "false")
    u.RawQuery = q.Encode()

    req, err := http.NewRequest("GET", u.String(), nil)
    if err != nil {
        panic(err)
    }

    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        panic(err)
    }

    fmt.Println(string(body))
}
Keep the main body as Markdown, then add the extra text fields your indexer, CMS, or LLM pipeline still needs.
This is usually as far as Markdown-specific rule design needs to go. If you need nested objects, typed collections, or evaluate, jump to Data extraction: Defining rules.

Prepare the page before conversion

Once the scope is right, make sure the wrapper is actually ready:

The following examples show how to use the Microlink API with CLI, cURL, JavaScript, Python, Ruby, PHP & Golang, targeting 'https://microlink.io' URL with 'data', 'meta', 'waitUntil', 'waitForSelector', 'click' & 'styles' API parameters:

CLI Microlink API example

microlink https://microlink.io&data.content.selector=main&data.content.attr=markdown&waitUntil=domcontentloaded&waitForSelector=main&click=#features&styles='header, footer { display: none !important; }'

cURL Microlink API example

curl -G "https://api.microlink.io" \
  -d "url=https://microlink.io" \
  -d "data.content.selector=main" \
  -d "data.content.attr=markdown" \
  -d "meta=false" \
  -d "waitUntil=domcontentloaded" \
  -d "waitForSelector=main" \
  -d "click=#features" \
  -d "styles=header%2C%20footer%20%7B%20display%3A%20none%20!important%3B%20%7D"

JavaScript Microlink API example

import mql from '@microlink/mql'

const { data } = await mql('https://microlink.io', {
  data: {
    content: {
      selector: "main",
      attr: "markdown"
    }
  },
  meta: false,
  waitUntil: "domcontentloaded",
  waitForSelector: "main",
  click: "#features",
  styles: [
    "header, footer { display: none !important; }"
  ]
})

Python Microlink API example

import requests

url = "https://api.microlink.io/"

querystring = {
    "url": "https://microlink.io",
    "data.content.selector": "main",
    "data.content.attr": "markdown",
    "meta": "false",
    "waitUntil": "domcontentloaded",
    "waitForSelector": "main",
    "click": "#features",
    "styles": "header, footer { display: none !important; }"
}

response = requests.get(url, params=querystring)

print(response.json())

Ruby Microlink API example

require 'uri'
require 'net/http'

base_url = "https://api.microlink.io/"

params = {
  url: "https://microlink.io",
  data.content.selector: "main",
  data.content.attr: "markdown",
  meta: "false",
  waitUntil: "domcontentloaded",
  waitForSelector: "main",
  click: "#features",
  styles: "header, footer { display: none !important; }"
}

uri = URI(base_url)
uri.query = URI.encode_www_form(params)

http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true

request = Net::HTTP::Get.new(uri)
response = http.request(request)

puts response.body

PHP Microlink API example

<?php

$baseUrl = "https://api.microlink.io/";

$params = [
    "url" => "https://microlink.io",
    "data.content.selector" => "main",
    "data.content.attr" => "markdown",
    "meta" => "false",
    "waitUntil" => "domcontentloaded",
    "waitForSelector" => "main",
    "click" => "#features",
    "styles" => "header, footer { display: none !important; }"
];

$query = http_build_query($params);
$url = $baseUrl . '?' . $query;

$curl = curl_init();

curl_setopt_array($curl, [
    CURLOPT_URL => $url,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_ENCODING => "",
    CURLOPT_MAXREDIRS => 10,
    CURLOPT_TIMEOUT => 30,
    CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
    CURLOPT_CUSTOMREQUEST => "GET"
]);

$response = curl_exec($curl);
$err = curl_error($curl);

curl_close($curl);

if ($err) {
    echo "cURL Error #: " . $err;
} else {
    echo $response;
}

Golang Microlink API example

package main

import (
    "fmt"
    "net/http"
    "net/url"
    "io"
)

func main() {
    baseURL := "https://api.microlink.io"

    u, err := url.Parse(baseURL)
    if err != nil {
        panic(err)
    }
    q := u.Query()
    q.Set("url", "https://microlink.io")
    q.Set("data.content.selector", "main")
    q.Set("data.content.attr", "markdown")
    q.Set("meta", "false")
    q.Set("waitUntil", "domcontentloaded")
    q.Set("waitForSelector", "main")
    q.Set("click", "#features")
    q.Set("styles", "header, footer { display: none !important; }")
    u.RawQuery = q.Encode()

    req, err := http.NewRequest("GET", u.String(), nil)
    if err != nil {
        panic(err)
    }

    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        panic(err)
    }

    fmt.Println(string(body))
}
Wait for the content wrapper, open the state you need, and hide the chrome that should not end up in the Markdown.
Use the preparation tools in this order:
  1. Better selector
  2. waitForSelector
  3. click or scroll
  4. styles
A few good defaults:
  • Use prerender: false when the HTML already contains the content you need.
  • Use prerender: true when a client-rendered page stays empty without a browser.
  • Prefer waitForSelector over waitForTimeout.
  • Keep adblock: true unless you explicitly need banners or ads in the output.

Use fallbacks when wrappers vary

When the same content lives under different wrappers across pages, define multiple rules in priority order:

The following examples show how to use the Microlink API with CLI, cURL, JavaScript, Python, Ruby, PHP & Golang, targeting 'https://microlink.io/docs/api/getting-started/overview' URL with 'data' & 'meta' API parameters:

CLI Microlink API example

microlink https://microlink.io/docs/api/getting-started/overview&data.content='[object Object],[object Object],[object Object]'

cURL Microlink API example

curl -G "https://api.microlink.io" \
  -d "url=https://microlink.io/docs/api/getting-started/overview" \
  -d "data.content=%5Bobject%20Object%5D%2C%5Bobject%20Object%5D%2C%5Bobject%20Object%5D" \
  -d "meta=false"

JavaScript Microlink API example

import mql from '@microlink/mql'

const { data } = await mql('https://microlink.io/docs/api/getting-started/overview', {
  data: {
    content: [
      {
        selector: "article",
        attr: "markdown"
      },
      {
        selector: "main",
        attr: "markdown"
      },
      {
        attr: "markdown"
      }
    ]
  },
  meta: false
})

Python Microlink API example

import requests

url = "https://api.microlink.io/"

querystring = {
    "url": "https://microlink.io/docs/api/getting-started/overview",
    "data.content": "[object Object],[object Object],[object Object]",
    "meta": "false"
}

response = requests.get(url, params=querystring)

print(response.json())

Ruby Microlink API example

require 'uri'
require 'net/http'

base_url = "https://api.microlink.io/"

params = {
  url: "https://microlink.io/docs/api/getting-started/overview",
  data.content: "[object Object],[object Object],[object Object]",
  meta: "false"
}

uri = URI(base_url)
uri.query = URI.encode_www_form(params)

http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true

request = Net::HTTP::Get.new(uri)
response = http.request(request)

puts response.body

PHP Microlink API example

<?php

$baseUrl = "https://api.microlink.io/";

$params = [
    "url" => "https://microlink.io/docs/api/getting-started/overview",
    "data.content" => "[object Object],[object Object],[object Object]",
    "meta" => "false"
];

$query = http_build_query($params);
$url = $baseUrl . '?' . $query;

$curl = curl_init();

curl_setopt_array($curl, [
    CURLOPT_URL => $url,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_ENCODING => "",
    CURLOPT_MAXREDIRS => 10,
    CURLOPT_TIMEOUT => 30,
    CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
    CURLOPT_CUSTOMREQUEST => "GET"
]);

$response = curl_exec($curl);
$err = curl_error($curl);

curl_close($curl);

if ($err) {
    echo "cURL Error #: " . $err;
} else {
    echo $response;
}

Golang Microlink API example

package main

import (
    "fmt"
    "net/http"
    "net/url"
    "io"
)

func main() {
    baseURL := "https://api.microlink.io"

    u, err := url.Parse(baseURL)
    if err != nil {
        panic(err)
    }
    q := u.Query()
    q.Set("url", "https://microlink.io/docs/api/getting-started/overview")
    q.Set("data.content", "[object Object],[object Object],[object Object]")
    q.Set("meta", "false")
    u.RawQuery = q.Encode()

    req, err := http.NewRequest("GET", u.String(), nil)
    if err != nil {
        panic(err)
    }

    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        panic(err)
    }

    fmt.Println(string(body))
}
Try the cleanest wrapper first, then fall back to broader rules only when needed.
This is the most useful Markdown-specific fallback pattern because it lets you keep the output clean without inventing separate per-site extractors too early.

Fix the most common markdown problems

  • Empty output: add waitForSelector, then try prerender: true.
  • Noisy output: tighten the selector before you reach for CSS cleanup.
  • Sticky headers, nav, or banners inside the result: use click, scroll, or styles before conversion.
  • Inconsistent wrappers across pages: use fallback rules instead of one fragile selector.

Use data extraction for the deeper dives

The Markdown guide should stay focused on scoping and conversion. Continue with Data extraction when you need:
  • nested objects or list-of-objects responses
  • collections that carry several fields per item
  • typed fields such as URLs, dates, images, or numbers
  • evaluate for custom browser-side extraction logic
  • device, viewport, media type, or browser automation details
  • timeout, auth, proxy, and advanced troubleshooting paths
The most relevant deeper pages are:
For the low-level syntax, see the MQL references for selector, selectorAll, attr, type, and evaluate.

Next step

Learn how to serve Markdown as JSON or a direct response in Delivery and response shaping.