How AngularJS let me down!!!


This all started when I was asked to build a simple website for a startup food blog. When I initially thought through the architecture, there was not much to decide on. The requirement was a simple page with some images and recipes that could be maintained easily and edited frequently. If this requirement had been handed to a consultancy, I'm pretty sure they would have reached for a CMS without a second thought. Well, they didn't ask a consultancy. They asked me. Not being very fond of CMS systems, I decided to go down the AngularJS path, which I was already accustomed to.

I was able to get the site hosted within days. I had a nice JSON data repository set up for the client to work with. Everything was going smoothly until I was asked this question: why aren't my links working on social media sites as they should be?!

So here comes the OMG moment. None of the social media crawlers liked the JavaScript in my pages. All those crawlers were seeing was an empty div tag inside the DOM. It hadn't occurred to me at all when I started designing the project that this would happen. So, while battling the prospect of the sleepless weekends I might have to suck up in the near future, I started thinking of alternatives.
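To make that concrete, here is roughly what a crawler would have fetched. The markup below is illustrative (the module name foodBlog is made up), but the point is the same: the Angular view container is empty until JavaScript runs, and these crawlers never run it.

    <!-- What the crawler sees: the raw index.html before Angular runs -->
    <body ng-app="foodBlog">
      <!-- ng-view stays an empty placeholder until JavaScript fills it in -->
      <div ng-view></div>
      <script src="angular.js"></script>
    </body>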

Do I finally give in and design a WordPress blog? No way am I gonna do that! I have more pride than that. So I decided to do a URL rewrite and present compiled HTML, generated using PhantomJS, to the social media crawlers. So how did I do it?

If I see the user agent string of a crawler, I compile the site using PhantomJS and present the HTML to the crawler. Sounds simple, but wait for it. The site was hosted on IIS, and even though running PhantomJS and doing the rewrite would have been much easier in Node, due to some constraints that was not an option. So I started walking down the dark alley of setting up IIS URL rewrite rules for the crawlers.
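The PhantomJS half of that is a small rendering script along these lines. This is a minimal sketch, not the exact script from the project: the file name, and the assumption that two seconds is enough for Angular to fetch the JSON and finish rendering, are mine.

    // prerender.js - load a URL in PhantomJS and dump the final HTML.
    var page = require('webpage').create();
    var system = require('system');
    var url = system.args[1];

    page.open(url, function (status) {
        if (status !== 'success') {
            console.log('Failed to load ' + url);
            phantom.exit(1);
        } else {
            // Give Angular a moment to fetch data and render the view
            // before taking the snapshot.
            setTimeout(function () {
                console.log(page.content); // the fully rendered markup
                phantom.exit(0);
            }, 2000);
        }
    });

Running something like phantomjs prerender.js http://example.com/#!/recipes/42 prints the post-render HTML, which can be saved as a static snapshot for the rewrite rule to serve.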

If you haven't guessed by now, this is just the first part of the solution. Next, I started a proof of concept for the IIS URL rewrite on GitHub.
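The rewrite side boils down to a rule like the one below in web.config. Again, a sketch, assuming the IIS URL Rewrite module is installed and that the PhantomJS snapshots live under a /snapshots/ folder; the folder name and the exact crawler pattern are my assumptions, not the rule from the repository.

    <system.webServer>
      <rewrite>
        <rules>
          <!-- Serve a pre-rendered snapshot when a known crawler asks -->
          <rule name="CrawlerSnapshot" stopProcessing="true">
            <match url="^(.*)$" />
            <conditions>
              <!-- Match on the crawler's user agent string; the pattern
                   here is illustrative, not exhaustive -->
              <add input="{HTTP_USER_AGENT}"
                   pattern="facebookexternalhit|Pinterest|StumbleUpon" />
            </conditions>
            <action type="Rewrite" url="/snapshots/{R:1}.html" />
          </rule>
        </rules>
      </rewrite>
    </system.webServer>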




So to summarize...

The problem

I was creating an AngularJS application for a client when I hit a roadblock. The project needed to support Facebook Open Graph properties, Pinterest Rich Pins, StumbleUpon links, and Google+ links. All of the above were failing.

The reason

None of the above-mentioned crawlers execute JavaScript, so I was stuck between a rock and a hard place. :( The solution I designed was to do a URL rewrite for all requests coming from those crawlers to a page generated using PhantomJS. Simple, right? Not quite. Since I already had HTML5-mode URL rewrites set up, it was pretty cumbersome to write all the URL rewrite logic for IIS. At one point I was not even sure what the user agent strings actually looked like. And so this project was born.
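For the record, these are representative user agent strings for the crawlers in question, as I understand them. The exact version numbers drift over time, so treat them as illustrative rather than canonical:

    facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
    Pinterest/0.2 (+http://www.pinterest.com/bot.html)
    Google (+https://developers.google.com/+/web/snippet/)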

Purpose

This project is going to help me understand the user agent strings used by the above-mentioned crawlers and how to do the URL rewrites for them.

Solution

I have done a URL rewrite for the above-mentioned crawlers and have supported and tested the following rich object and schema handling platforms; a sketch of the markup involved follows the list:

http://schema.org/
http://ogp.me/ [Facebook]
https://developers.pinterest.com/docs/rich-pins/overview/ [Pinterest]
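As promised above, here is a sketch of the kind of tags the pre-rendered snapshots need to carry so these platforms can build their previews. The recipe title and URLs are made-up examples:

    <!-- Open Graph tags read by Facebook; Pinterest Rich Pins read these too -->
    <meta property="og:title" content="Spicy Chickpea Curry" />
    <meta property="og:type" content="article" />
    <meta property="og:image" content="http://example.com/img/curry.jpg" />
    <meta property="og:url" content="http://example.com/#!/recipes/42" />
    <!-- schema.org equivalents, expressed as microdata -->
    <meta itemprop="name" content="Spicy Chickpea Curry" />
    <meta itemprop="image" content="http://example.com/img/curry.jpg" />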
