Machine Learning Fix for the Internal Link Structure of Expedia, an 18-Million-Page Travel Site

An Enterprise Technical SEO case study by In Marketing We Trust


Challenges

  • 18-million-page site with too many page templates
  • Over 50k links pointing to 404 pages
  • Multiple teams working on website
  • Inventory changing frequently
  • Time-consuming update every 6 months
  • Needed to work with the historical solution and include manual overrides

 


Results

  • Over 50k links pointing to 404s removed
  • 25% of useless links removed
  • Designed a new architecture with approx. 99% cleanliness
  • Daily recommendations and automated updates sent

Our Goals

  1. Fix the internal link structure for a very large site (25 million pages indexed) with over 50k links pointing to 404 pages
  2. Ensure all internal links point to canonical pages and reduce 301 redirects
  3. Design a dynamic, automated solution to provide ongoing updates to the internal architecture

What We Did

Expedia already had an internal link structure widget that provided some level of internal linking. However, its links were quickly becoming outdated as ongoing business activity changed the site.

Built a Customised Web Crawler

The crawler solutions available at the time would not make the cut for this site.
We upgraded our own customised web crawler to crawl the website rapidly and efficiently and to identify broken or underperforming links.
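As a rough illustration of the kind of check such a crawl enables, the sketch below groups internal links by the HTTP status of their target. It is not the crawler described here: the `Link` type, the `audit_links` function and the example URLs are hypothetical, and the status codes are assumed to have been collected by a crawl that has already run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str  # page that contains the link
    target: str  # URL the link points to


def audit_links(links, status_by_url):
    """Group links by the HTTP status of their target.

    status_by_url maps a crawled URL to its status code. Targets the
    crawl has not fetched are reported under "unknown".
    """
    report = {"ok": [], "redirect": [], "broken": [], "unknown": []}
    for link in links:
        status = status_by_url.get(link.target)
        if status is None:
            report["unknown"].append(link)
        elif 300 <= status < 400:
            report["redirect"].append(link)  # e.g. 301s to be reduced
        elif status >= 400:
            report["broken"].append(link)    # e.g. links to 404 pages
        else:
            report["ok"].append(link)
    return report


# Hypothetical crawl output
links = [Link("/a", "/b"), Link("/a", "/gone"), Link("/b", "/old")]
statuses = {"/b": 200, "/gone": 404, "/old": 301}
report = audit_links(links, statuses)
print(len(report["broken"]), len(report["redirect"]))
```

Keeping redirects separate from broken links mirrors the two goals stated earlier: removing links to 404 pages and reducing links that pass through 301 redirects.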

Working closely with the client on its business concerns, we also built an automated logic engine that identifies the best action to take with a broken link: change its target, remove it, or flag it for human attention.
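The three possible outcomes can be pictured as a small decision function. The sketch below is only an assumption about how such rules might look: the thresholds, the `manual_override` flag and the `decide_action` name are hypothetical, and the actual business rules are not described in this case study.

```python
from enum import Enum


class Action(Enum):
    RETARGET = "change target"
    REMOVE = "remove"
    FLAG = "flag for human attention"


def decide_action(candidate, confidence, manual_override=False,
                  retarget_at=0.8, remove_below=0.3):
    """Pick an action for one broken link.

    candidate: suggested replacement URL, or None if there is none.
    confidence: score in [0, 1] for that suggestion.
    The thresholds are illustrative values, not tuned ones.
    """
    if manual_override:
        # Links curated by hand are never changed automatically.
        return Action.FLAG
    if candidate is None or confidence < remove_below:
        return Action.REMOVE
    if confidence >= retarget_at:
        return Action.RETARGET
    # Plausible but uncertain replacement: let a person decide.
    return Action.FLAG


print(decide_action("/hotels/sydney", 0.92).value)
```

Routing the uncertain middle band to a person keeps the automated path limited to clear-cut cases.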

Developed a Recommendation Engine

Using some of our existing Machine Learning solutions, we developed an algorithm capable of suggesting the most accurate replacement for an error page and turned a painstaking manual process into an automated one.
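The model itself is not described here, so the sketch below swaps in a much simpler stand-in to show the shape of the task: it ranks live pages by token overlap (Jaccard similarity) with the URL of the error page. The function names and example URLs are hypothetical, and a production system would likely use richer signals than URL text.

```python
import re


def url_tokens(url):
    """Split a URL into a set of lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", url.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def suggest_replacement(broken_url, live_urls):
    """Return (best_url, score) among live_urls, or (None, 0.0)."""
    broken = url_tokens(broken_url)
    best_url, best_score = None, 0.0
    for candidate in live_urls:
        score = jaccard(broken, url_tokens(candidate))
        if score > best_score:
            best_url, best_score = candidate, score
    return best_url, best_score


# Hypothetical URLs for illustration only
live = ["/hotels/sydney-cbd", "/flights/sydney", "/hotels/melbourne"]
print(suggest_replacement("/hotels/sydney-cbd-hotel-123", live))
```

The returned score can serve as the confidence value that decides whether a suggestion is applied automatically or passed to a person.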

The internal link structure is now progressively kept up to date, with fix recommendations being issued every 24 hours.
This has brought the number of bad internal links down from tens of thousands to fewer than a thousand at any given time. It has also considerably improved the valid-to-invalid Googlebot crawl ratio, meaning that the site is updated faster and is seeing improved organic results on search engines.

Connected Crawl Rate with Sales Numbers

The next step in this project was to start analysing the conversion outcomes of landing pages and to steer the internal link structure towards pages that convert better. This had to be done while maintaining the breadth of internal links needed for crawlers to still find everything on the site, and while keeping the links highly relevant.
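One way to picture this trade-off is a selection step that reserves some link slots for under-linked pages and fills the rest by relevance weighted by conversion. The sketch below is a minimal illustration under that assumption. The `select_links` function, the scoring formula and the sample figures are hypothetical and are not taken from the project.

```python
def select_links(candidates, inbound_counts, slots, reserved=1):
    """Choose link targets for one page.

    candidates: list of (url, relevance, conversion_rate) tuples.
    inbound_counts: url -> number of internal links already pointing to it.
    slots: how many links the page can carry.
    reserved: slots kept for the least-linked candidates (breadth).
    """
    reserved = min(reserved, slots)
    # Breadth: least-linked pages first, most relevant first on ties.
    by_need = sorted(candidates,
                     key=lambda c: (inbound_counts.get(c[0], 0), -c[1]))
    chosen = [url for url, _, _ in by_need[:reserved]]
    # Value: fill the remaining slots by relevance * conversion rate.
    by_value = sorted(candidates, key=lambda c: c[1] * c[2], reverse=True)
    for url, _, _ in by_value:
        if len(chosen) >= slots:
            break
        if url not in chosen:
            chosen.append(url)
    return chosen


# Hypothetical figures for illustration only
candidates = [("/a", 0.9, 0.05), ("/b", 0.8, 0.02),
              ("/c", 0.7, 0.01), ("/d", 0.6, 0.04)]
inbound = {"/a": 10, "/b": 5, "/c": 0, "/d": 3}
print(select_links(candidates, inbound, slots=3))
```

Here `/c` is picked first because nothing links to it yet, and the remaining slots go to the candidates with the best relevance-weighted conversion.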

Results

  • Over 50k links pointing to 404s removed
  • 25% of useless links removed
  • Designed a new architecture with approx. 99% cleanliness
  • Daily recommendations and automated updates sent
  • Integrated with historical system including manual overrides
  • Tied to business metrics, increasing sales

Large-scale internal linking check: results

The team at In Marketing We Trust, led by the very knowledgeable Frederic, has been working with Expedia since 2014 and has helped us uncover and fix a number of on-site issues across our 18 million page website. Their deep understanding of complex sites’ issues and pragmatic approach to enterprise SEO in an autonomous fashion helped me deliver results while allowing me to focus on some other matters.

I would highly recommend his skills & the team at In Marketing We Trust to anyone looking to succeed in Digital, and Search in particular.

Julien P., SEO Manager, Expedia