Decode HTML entities in iOS

While working on one of our legacy projects, I came across some interesting (read: code smell) code. We were getting HTML-encoded strings from one of our content calls, and we deliberately avoided loading them into a UIWebView using

[web loadHTMLString:[HTML String] baseURL:[Project base URL]]

Pro tip: We usually leave the baseURL parameter nil. However, there are cases where you will want to use it. Say you have an image file named awesomeImage.png in your project and you want to display it in a UIWebView with an HTML img tag such as

<img src='awesomeImage.png'/>

You can do that by passing a baseURL to loadHTMLString:baseURL: as follows:

NSString* bundlePath = [[NSBundle mainBundle] bundlePath];
NSURL* baseURLForBundle = [NSURL fileURLWithPath:bundlePath];
[webView loadHTMLString:@"<img src='awesomeImage.png'/>" baseURL:baseURLForBundle];

Now the web view can resolve and load the local image.

That was quite a digression, so back to the main point.

The reason was that this title was displayed in a UITableViewCell inside a UITableView. We thought that adding a UIWebView to each cell could be an expensive operation and hurt scrolling as the number of cells grew. So what I did a couple of years back was decode the string manually, keeping a mapping from each encoded entity to its decoded value, like

&amp; -> &
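
For illustration, the manual mapping approach might have looked roughly like the sketch below. The function name DecodeEntitiesWithTable and the four entities in the table are assumptions made for this example, not the project's actual code. It also shows why the approach does not scale: any entity missing from the table is left untouched.

```objectivec
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces a small, fixed set of named entities.
// Anything not listed in the table passes through unchanged.
static NSString *DecodeEntitiesWithTable(NSString *input) {
    // Order matters: &amp; is decoded last, so "&amp;lt;" becomes "&lt;" and not "<".
    NSArray *table = @[
        @[@"&lt;",   @"<"],
        @[@"&gt;",   @">"],
        @[@"&quot;", @"\""],
        @[@"&amp;",  @"&"],
    ];
    NSMutableString *result = [input mutableCopy];
    for (NSArray *pair in table) {
        [result replaceOccurrencesOfString:pair[0]
                                withString:pair[1]
                                   options:NSLiteralSearch
                                     range:NSMakeRange(0, result.length)];
    }
    return [result copy];
}
```

Numeric references such as &#38; and the long tail of named entities (&trade;, &hearts; and so on) would each need their own entry or extra parsing logic, which is what makes this approach painful to maintain.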

I know, I know, this is a terrible way to do it. While going over it today, I realized that neither this nor the UIWebView solution was feasible.

NB: Our project targets the iOS 8 SDK.

I was determined to fix this bug for good. Thanks to Google, after a few seconds of searching I found a StackOverflow answer, which I got working. Here's the gist of how you can do it (as quoted from that answer):

NSString* htmlString = @"&amp; &#38; &lt; &gt; &trade; &copy; &hearts; &clubs; &spades; &diams;";
NSData* stringData = [htmlString dataUsingEncoding:NSUTF8StringEncoding];
NSDictionary* options = @{NSDocumentTypeDocumentAttribute:NSHTMLTextDocumentType};
NSAttributedString* decodedAttributedString = [[NSAttributedString alloc] initWithData:stringData options:options documentAttributes:NULL error:NULL];
NSString* decodedString = [decodedAttributedString string];

And that's it. All the HTML entities are converted to the actual characters. I am not sure how big the performance hit of using NSAttributedString is, but it seemed to hold up well in a UITableView with more than 100 records.
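
If the cost of NSAttributedString ever does become a concern while scrolling, one option is to decode each string once and cache the result. The following is only a sketch under that assumption: the helper name DecodedTitle and the use of NSCache are my own choices for illustration, not something taken from the StackOverflow answer. It also passes NSCharacterEncodingDocumentAttribute so that non-ASCII characters in the input are read as UTF-8, and it falls back to the raw string if parsing fails.

```objectivec
#import <UIKit/UIKit.h>

// Hypothetical helper, not part of the original project.
// Call it on the main thread: Apple's documentation says the HTML importer
// should not be called from a background thread.
static NSString *DecodedTitle(NSString *encoded) {
    if (encoded.length == 0) {
        return @"";
    }

    static NSCache *cache;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        cache = [[NSCache alloc] init];
    });

    // Reuse a previously decoded value so each string is parsed only once.
    NSString *cached = [cache objectForKey:encoded];
    if (cached) {
        return cached;
    }

    NSData *data = [encoded dataUsingEncoding:NSUTF8StringEncoding];
    NSDictionary *options = @{
        NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
        NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)
    };
    NSError *error = nil;
    NSAttributedString *attributed =
        [[NSAttributedString alloc] initWithData:data
                                         options:options
                              documentAttributes:NULL
                                           error:&error];

    // Fall back to the raw string if the HTML importer fails.
    NSString *decoded = attributed ? attributed.string : encoded;
    [cache setObject:decoded forKey:encoded];
    return decoded;
}
```

In tableView:cellForRowAtIndexPath: this could be used as cell.textLabel.text = DecodedTitle(item.title), where item.title stands in for whatever model property holds the encoded string.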

Guess I know what I am going to do the next time I encounter an HTML-encoded string coming straight from a remote API.