Build Your Own Search Engine: Python Programming Series
In this tutorial, I'll use a web framework called Django to build a search app. Django is a popular Python framework for building websites and web apps, and it comes with a bunch of tools that make web development easier.
Mapping these ideas to a search engine, we can see that it needs to handle a number of core functions. The search index needs to store all the terms users query against, along with a score for each term that indicates how useful it is to the user.
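To make the idea of a term index with scores concrete, here is a minimal sketch in plain Python. It is not the index this tutorial's Django app uses: the `SearchIndex` class, its method names, and the TF-IDF-style weighting are all assumptions chosen for illustration, and it assumes each document ID is added only once.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class SearchIndex:
    """Hypothetical in-memory inverted index (illustration only)."""

    def __init__(self):
        # term -> {doc_id: number of times the term occurs in that document}
        self.postings = defaultdict(dict)
        self.doc_count = 0

    def add(self, doc_id, text):
        """Index one document. Assumes doc_id has not been added before."""
        for term, count in Counter(tokenize(text)).items():
            self.postings[term][doc_id] = count
        self.doc_count += 1

    def search(self, query):
        """Return (doc_id, score) pairs, highest score first."""
        scores = defaultdict(float)
        for term in tokenize(query):
            docs = self.postings.get(term, {})
            if not docs:
                continue
            # Terms that appear in fewer documents get a larger weight.
            idf = math.log(1 + self.doc_count / len(docs))
            for doc_id, count in docs.items():
                scores[doc_id] += count * idf
        return sorted(scores.items(), key=lambda item: item[1], reverse=True)


index = SearchIndex()
index.add("a", "django web framework")
index.add("b", "python search engine search")
index.add("c", "python web apps")
results = index.search("python search")
```

Here `results` lists document "b" ahead of "c", because "b" contains both query terms and the rarer term "search" carries more weight than "python".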
Before Google, the vast majority of search was done using a TLD-based mechanism. The robust query protocol of Google is a close second. In simplified terms, when you type in a query, Google looks for its keywords in page content. If the keywords appear on a page, Google ranks that page. If the words appear in a link, Google follows the link and reads the linked page's content to determine whether that page is relevant.
Google Search relies on a series of high-level ranking functions to determine which websites are most relevant to the user. While many search engines focus on algorithmic solutions, Google's approach is to cluster the search results based on the user experience of a webpage or, more generally, of a website.
Search engines like Google provide some free APIs to both developers and consumers. Although not generally open source, Google's APIs are usually released as libraries that can be downloaded and built upon.
If a search engine relies on APIs from other sources, it may stop working well when one of those APIs is updated. The wrapper approach, however, allows the search engine to be highly customized to the needs of the programmer.
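One way to picture the wrapper approach is a thin class that hides the external API behind your own interface, so an upstream change only touches a small adapter function. The sketch below is an assumption about how such a wrapper might look: `SearchWrapper`, `SearchResult`, and the fake provider functions are hypothetical and do not correspond to any real Google API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SearchResult:
    """The result shape the rest of the app depends on."""
    title: str
    url: str


class SearchWrapper:
    """Hypothetical wrapper around an external search API."""

    def __init__(self, fetch: Callable[[str], List[Dict]],
                 adapt: Callable[[Dict], SearchResult]):
        self._fetch = fetch  # calls the external API (faked below)
        self._adapt = adapt  # converts one raw item into a SearchResult

    def search(self, query: str) -> List[SearchResult]:
        return [self._adapt(raw) for raw in self._fetch(query)]


# Fake provider standing in for a real API response (assumed shape).
def fake_fetch_v1(query: str) -> List[Dict]:
    return [{"name": "Result for " + query, "link": "https://example.com/1"}]


def adapt_v1(raw: Dict) -> SearchResult:
    return SearchResult(title=raw["name"], url=raw["link"])


# The same provider after an imagined update that renamed its fields.
def fake_fetch_v2(query: str) -> List[Dict]:
    return [{"title": "Result for " + query, "href": "https://example.com/1"}]


def adapt_v2(raw: Dict) -> SearchResult:
    return SearchResult(title=raw["title"], url=raw["href"])


old_results = SearchWrapper(fake_fetch_v1, adapt_v1).search("django")
new_results = SearchWrapper(fake_fetch_v2, adapt_v2).search("django")
```

Code that calls `search()` sees the same `SearchResult` objects before and after the imagined update. Only the adapter function has to change.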