The arachn.io API combines proprietary implementations of core web standards, such as JSON-LD, the Open Graph protocol, and Schema.org structured microdata, with cutting-edge artificial intelligence techniques to create structured data from web addresses and public webpages.
Anyone can use the Free Forever Plan to try out the API. It includes all of the API's core endpoints, including unwind and extract.
URL and hostname parsing is an important part of undirected web crawling and content analysis.
It's easy for a website to link to itself or to other websites, but much harder for it to get other websites to link back.
The arachn.io API allows code to distinguish between internal and external link types and detect valuable outlinks.
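To make the internal/external distinction concrete, here is a minimal sketch using only Python's standard library. It is not the arachn.io API or its method; the function names hostname_of and classify_link are made up for illustration, and the rule it applies (compare hostnames after dropping a leading "www.") is a simplifying assumption.

```python
from urllib.parse import urljoin, urlsplit


def hostname_of(url: str) -> str:
    """Return the lowercased hostname of a URL, without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def classify_link(page_url: str, href: str) -> str:
    """Label a link found on page_url as 'internal' or 'external'.

    Illustrative rule only: two URLs are 'internal' to each other when
    their hostnames match exactly (ignoring 'www.').
    """
    # Resolve relative links such as '/about' against the page's own URL.
    target = urljoin(page_url, href)
    same_host = hostname_of(target) == hostname_of(page_url)
    return "internal" if same_host else "external"


page = "https://www.example.com/blog/post"
print(classify_link(page, "/about"))
print(classify_link(page, "https://other.example.org/x"))
```

Under this rule a subdomain such as blog.example.com counts as external to example.com. Grouping hosts by registrable domain would need a public suffix list, which this sketch deliberately leaves out.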
Many of the most valuable links online today, particularly those embedded in social media, use so-called "link shorteners" like bit.ly and t.co that hide the real target of a link.
The arachn.io API unwinds links to reveal the link's actual target, and the target's canonical form if possible.
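The idea of unwinding can be sketched as following a chain of redirects until it ends. The example below is an assumption-laden illustration, not the API's implementation: unwind and lookup_redirect are hypothetical names, and the redirect table is made-up data standing in for real HTTP responses so the example runs offline. It does not attempt canonicalization of the final URL.

```python
from typing import Callable, List, Optional
from urllib.parse import urljoin


def unwind(url: str,
           lookup_redirect: Callable[[str], Optional[str]],
           max_hops: int = 10) -> List[str]:
    """Follow a chain of redirects and return every URL visited, in order.

    lookup_redirect is a hypothetical callable: given a URL, it returns the
    redirect target (for example an HTTP Location header value) or None
    when the URL does not redirect.
    """
    chain = [url]
    for _ in range(max_hops):
        location = lookup_redirect(chain[-1])
        if location is None:
            break
        # Redirect targets may be relative, so resolve against the current URL.
        next_url = urljoin(chain[-1], location)
        if next_url in chain:  # stop on redirect loops
            break
        chain.append(next_url)
    return chain


# Made-up redirect table standing in for real HTTP responses.
FAKE_REDIRECTS = {
    "https://short.example/abc": "https://tracker.example/r?id=1",
    "https://tracker.example/r?id=1": "https://news.example/story",
}

chain = unwind("https://short.example/abc", FAKE_REDIRECTS.get)
final_url = chain[-1]
print(final_url)
```

In real use, lookup_redirect would issue an HTTP request without following redirects and read the response's Location header. The hop limit and loop check keep a misbehaving shortener from trapping the caller.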
Most of the HTML on a webpage is useless. For example, the navigation bar is typically exactly the same on every page of a website!
On most webpages, especially articles, the page's "body" is the important content, but there is no standard, well-adopted way in which the page's body is demarcated.
The arachn.io API uses proprietary algorithms and cutting-edge artificial intelligence to find, extract, and structure this valuable content.
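For a sense of why this is hard, here is a deliberately naive baseline in Python's standard library. It is not the proprietary algorithm described above, just a simple heuristic assumed for illustration: drop text inside elements that usually hold boilerplate and keep the rest. The names SKIPPED_TAGS, BodyTextExtractor, and extract_body_text are hypothetical.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as boilerplate in this naive heuristic.
SKIPPED_TAGS = {"nav", "header", "footer", "aside", "script", "style"}


class BodyTextExtractor(HTMLParser):
    """Collect text that is not nested inside any skipped element."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_body_text(html: str) -> str:
    """Return the page text with the skipped elements removed."""
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


sample = (
    "<html><body><nav><a href='/'>Home</a></nav>"
    "<article><h1>Title</h1><p>Hello world.</p></article>"
    "<footer>Copyright</footer></body></html>"
)
print(extract_body_text(sample))
```

This heuristic only works when a site happens to use semantic tags such as nav and footer. Pages that build their navigation and sidebars from generic div elements defeat it, which is the gap the lack of a standard body marker creates.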