The first step is finding out which pages exist on the web. Because there is no central registry of all web pages, Google must constantly look for new and updated pages to add to its list of known pages; this process is called URL discovery. Some pages are already known because Google has visited them before. Google discovers new pages by following links from known pages to new ones: for example, a hub page, such as a category page, may link to a newly published blog post. Google also discovers pages when a site owner submits a list of URLs (a sitemap).
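To make link-based discovery concrete, here is a minimal Python sketch that fetches one already-known page and collects the URLs it links to. It uses only the standard library; the hub-page URL is a placeholder, and this is an illustration of the idea rather than how Google's own discovery pipeline works.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkCollector(HTMLParser):
    """Collect href values from anchor tags on a fetched page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def discover_urls(known_url):
    """Return absolute URLs linked from a page that is already known."""
    html = urlopen(known_url).read().decode("utf-8", errors="replace")
    parser = LinkCollector()
    parser.feed(html)
    return {urljoin(known_url, href) for href in parser.links}

# Hypothetical hub page (e.g. a blog category page) linking to new posts.
print(discover_urls("https://example.com/blog/"))
```

Each URL collected this way becomes a candidate for crawling, which is what the next step describes.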
Once a URL is discovered, Google may visit (or "crawl") the page to find out what's on it. Google uses a huge set of machines to crawl billions of pages on the web. The program that does the fetching is called Googlebot, also known as a crawler, robot, bot, or spider. Googlebot uses an algorithmic process to decide which sites to crawl, how often to crawl them, and how many pages to fetch from each site. Google's crawlers are also programmed to avoid crawling a site too quickly, so they don't overload it.
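The "don't overload the site" rule can be illustrated with a small rate-limiting sketch: the crawler remembers when it last fetched from each host and waits before hitting the same host again. The delay value and the bot name are assumptions for the example; real crawl scheduling is adaptive and far more sophisticated.

```python
import time
from urllib.parse import urlsplit
from urllib.request import Request, urlopen

CRAWL_DELAY_SECONDS = 5.0   # assumed fixed per-host delay for this sketch
_last_fetch = {}            # host -> time of the previous fetch

def polite_fetch(url, user_agent="ExampleBot/1.0"):
    """Fetch a page, waiting so the same host is not hit too quickly."""
    host = urlsplit(url).netloc
    elapsed = time.monotonic() - _last_fetch.get(host, 0.0)
    if elapsed < CRAWL_DELAY_SECONDS:
        time.sleep(CRAWL_DELAY_SECONDS - elapsed)
    _last_fetch[host] = time.monotonic()
    request = Request(url, headers={"User-Agent": user_agent})
    return urlopen(request).read()
```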
However, Googlebot doesn't crawl every page it discovers. Some pages may be disallowed for crawling by the site owner (for example, via robots.txt), and others can't be accessed without logging in.
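A crawler can check the site owner's robots.txt rules before fetching a page. The sketch below uses Python's standard-library robots.txt parser; the URLs and user-agent name are placeholders.

```python
from urllib.robotparser import RobotFileParser

robots = RobotFileParser()
robots.set_url("https://example.com/robots.txt")
robots.read()  # fetch and parse the site's robots.txt

# Only crawl URLs the site owner allows for this user agent.
if robots.can_fetch("ExampleBot", "https://example.com/private/report.html"):
    print("allowed to crawl")
else:
    print("blocked by robots.txt")
```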
During the crawl, Google renders the page and runs any JavaScript it finds using a recent version of Chrome, similar to how your browser renders the pages you visit. Rendering is important because websites often rely on JavaScript to bring content onto the page, and without rendering Google might not see that content.
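To see the difference rendering makes, you can compare a page's raw HTML with the DOM after its scripts have run. The sketch below uses headless Chromium driven by the third-party Playwright library; this is just one way to render a page locally, not the renderer Google itself uses, and the URL is a placeholder.

```python
from playwright.sync_api import sync_playwright

def render(url):
    """Load a page, run its JavaScript in headless Chromium, return the final HTML."""
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")
        html = page.content()  # DOM after scripts have run, not the raw source
        browser.close()
    return html

print(render("https://example.com/js-powered-page"))
```

On a page that injects its content with JavaScript, the output of page.content() will include text that never appears in the raw HTML response, which is exactly the content a non-rendering fetch would miss.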