The Land of Fire, Ice and Very Expensive Soup

We’re just back from 4 nights in Reykjavik, which was an absolute blast. We did the usual suspects with trips to the Golden Circle and the Blue Lagoon. We weren’t so lucky with the Northern Lights, although we did get to see a murmuring on our second last night. The long exposure below [15 seconds, F4, ISO 800] picks out a lot more detail than was visible with the naked eye:

Northern Lights

We knew from the weather forecast that there wasn’t really much point in booking an organised tour [and we really only had two nights to play with]. This was actually on a night when the tours had been cancelled. The composition is dire, but it was the darkest spot that we could find around the Old Harbour, and it was blowing a gale.

A few other obligatory waterfall / geyser type shots included below. Oh, and just to explain the title: on the Golden Circle, we called in for lunch at the facilities by Geysir, where you suffer the consequences of being a captive audience. The fish soup we had for lunch was delicious, but £28 for two bowls was almost as memorable as the scenery!

Gullfoss

A Geyser; not the Geysir

Where gloves go to die

pfSense: Adding a Second LAN

While this is undoubtedly a beginner’s question, it’s one that I spent most of yesterday wrestling with. I also really struggled to find information on it: how to add a second LAN.

There are plenty of ways of achieving what I want on my network – to subdivide it between devices I trust, and ones I don’t [or at least trust less, such as my IP camera]. The new machine I got to run pfSense on has 4 network interfaces, so I decided to run two LANs straight off the adaptors [leaving one spare for a possible future experiment with IoT nonsense].

Adding the interface is well documented, as is the ‘default allow’ you’ll need to set in the firewall rules. What you also need to do is to configure a DHCP server, which is under the Services menu in the WebGUI. You’ll see there is an entry already configured for the first LAN, which you can use to figure out the settings. Obviously, this assumes that you configured the LAN for DHCP during the setup, which almost everyone is going to want to do.

I set an address range of .2 – .254, and then configured both the DNS server and the Gateway on .1. You’ll also have to set the ‘enable DHCP’ checkbox at the top, which is disabled by default.

While it’s obvious in retrospect, I went in completely the wrong direction, thinking it was something to do with routing rules. Routing is all well and good, but I was never going to get very far without an IP address :).

pfSense on a Celeron J1900

I spent the weekend setting up pfSense on a new piece of kit that I got last week. A reseller on Amazon is selling a bare bones box with 4 Ethernet ports, a Celeron J1900 processor, 2GB of RAM and a 64GB SSD for £170, which I thought would be perfect for the task. It’s probably a little over-spec’ed if anything, but it’s a really lovely piece of kit.

I spent the entirety of the install process navigating various options in the Aptio BIOS interface. I had two issues. The first was pretty trivial, which was setting the boot order via ‘HD BBS Priorities’. The option above it, ‘Boot Option’, seemed the more likely candidate but didn’t have any effect.

The second took me hours to figure out. I could see lots of Google hits for BSD installs and runtime issues, but nothing that fit the problem that I was having. During the install a command called bsdlabel hung, and then returned an error, ‘WRITE_EPDMA_QUEUED’, followed by ‘CAM status: command timeout’.

To cut a [very] long story short, I fixed it by setting SATA CONFIG -> SATA Mode -> IDE Mode.

It took me so long to get the install working I’ve not had a chance to play with pfSense itself yet, other than to prove it’s working. One immediate challenge I’ve yet to figure out is how to access the web interface if it’s ‘north’ of a wireless access point….

IP Camera Data Privacy

I’ve had a few goes at setting up the Motion package on my Raspberry Pi, but I’ve finally abandoned it. In its stead, we recently bought an IP camera manufactured by a company called Annke which, at the time of writing, is among the best selling surveillance cameras on Amazon. It’s a nice piece of kit but given that it only cost £40 and has a lot of moving parts, it’s not one that I expect to survive down the years.

I thought it would be interesting to proxy the traffic on my phone to see what’s happening.

My iPhone isn’t jailbroken, which would have been a showstopper if the server that the app is talking to was using certificate pinning. It’s not. The first call is over plain HTTP. I’m not going to copy it here, because some of the payload decodes to binary, and there’s a possibility that I might be broadcasting my own password. Doh!

So the first call is a GET to a server running on Amazon’s cloud service, listening on port 7080. I checked the IANA registry: while there is something assigned to that ‘officially’ [some identity management software called empowerid] I think it’s a coincidence, and it’s probably just a web server of some kind running on a non standard port. The response doesn’t report back the server software name. Included in the GET parameters, there are a series of comma and slash [url encoded] separated parameters which are base64 encoded. These decode into binary, and could be anything. Included among the readable parameters is my username.

The response back is a block of JSON, referencing different URLs on the same server, which a geolocation service reliably informs me is in a data centre operated by an outfit called OVH in Roubaix, northern France. The URLs have helpful prefixes: ‘signal’, ‘debug’, ‘ping’, and ‘ntp’ among them. Not all of the URLs are referring to web traffic: there’s one reference to telnet, which is a blast from the past, and another called ‘binnet://’ which is sufficiently non standard that Google keeps insisting on telling me about ‘bonnets’ :). That final ‘binnet’ URL refers back to the original AWS server.

The app then does a second GET to a server, this time in Tampa, Florida. I’m not going to break this one down in any sort of detail because the server refuses the connection, so it can’t be too important!

Next, the app opens a TLS connection to the French server, and does 10 separate GETs. The 3rd of these includes my password, which I registered on first run.

Here’s what I imagine is happening: the camera is going to be polling the server in France cyclically, asking the question, ‘do I need to transmit to you yet?’ When I connect to the same server via the phone app, the answer comes back as a ‘yes’. The camera starts to transmit, and the server then relays the stream back to my app. Wireshark should be able to give me some pointers, but I’m running out of time to look at it today. If I find anything interesting or contradictory when I do get round to looking at it, I’ll do a separate post on it.

By the way, the app seems to be using JavaScript to instantiate the video stream in HTML5. Apple have a video from the WWDC in 2013 on exactly this topic.

So in summary, it looks like the video stream of our back garden / the cat / my wife and I occasionally waving at the camera ends up in France, with the server there ‘joining’ the connection from the camera to the app.

The end state with my Raspberry Pi, while not necessarily secure in and of its own right, certainly wasn’t nearly as ‘mediated’, shall we say. I set up our broadband router with a port forwarding rule and configuration for a DDNS service. That meant we could connect from our phones to the web server integrated into the Motion package. That was all over vanilla HTTP – hey, at least I set a 401 password!

While I can’t do anything about how the camera operates – a trade-off I’m willing to make based on pure utility – I’m probably going to take a look at partitioning off the network with a proper firewall behind the broadband router [which can be configured to operate as a modem only]. I’ll put all of the ‘less trusted’ devices on their own little segment. pfSense seems to be the way to go.

Cheerio iOS, Hello Android. For now…

I’ve decided to let my Apple Developer subscription lapse in a couple of weeks. I wavered over the decision when I realised that my apps would be disappearing off the App Store. Before they go, I did a little digging on iTunes Connect to see how they’ve done. Badly: a combined total of 2.15k downloads since I released WeighMe in December 2012.

Despite the paltry totals – oh, and let’s not forget my $24 in sales! – I was impressed to see that the apps have been downloaded in a total of 77 countries. So, for my solitary users in Armenia, Ghana, Paraguay and Uzbekistan: thanks for the downloads, folks!

I want to learn Java so my next handset, when I upgrade some time around the end of the year, will be running Android. If the iPhone 7 is released without an audio jack [as is heavily rumoured], I’d jump ship regardless. I have enough chargers and whacky adapters to last me a lifetime without adding more for headphones.

All of which is a lovely story, but when I tried Android about three years ago, I hated it. And if Apple do buck the trend with the audio jacks, the rest of the manufacturers will probably follow suit. So I may quietly return to the fold and pretend that I’ve never been away :).

How does the Twitter app know my location?

[Edit: the answer to this is actually very simple. I’d been focusing so much on the various mobile frameworks that I forgot about the obvious one: it’s my source IP. There are services out there that take your IP address and, based on service provider information, turn it into a location. The same mechanism is going to be available whether you are using your browser – where you will commonly see location specific ads – or an app.]

OK, maybe not the most thrilling of titles, but I’ve been interested in location services pretty much since I started developing for iOS, and something happened this morning while using the Twitter app that piqued my interest.

I followed a link to a site, which popped up an embedded browser – so a UIWebView. At the bottom of the article, amid the usual rubbish, was a link to another site telling me how a millionaire in my named home town was earning an unlikely amount working from home.

The thing is that in my Location Services settings, I have the Twitter app set to ‘never’. There are a couple of other possible candidates. I was wondering if the UIWebView was inheriting some settings from Safari, but I have the Safari Websites setting on ‘never’. Also, the call to start the Location Manager happens in the calling app – so the corresponding privacy setting should be in the context of that app.

Looking at the other System Settings under Location Services, there’s one other candidate: iAds. I’ve not used this in my own apps, but I’ve just checked: they are views embedded in native apps, not in UIWebViews. And anyway, I have the setting disabled.

There are a few other System Settings that I have set to ‘on’, such as WiFi Networking and Location-Based alerts, none of which should have anything to do with the Twitter app.

So what’s going on? Wild conspiracy theories aside, I can’t understand how the app could be getting my location when the primary privacy setting for the app is ‘never’.

PDF to Text Conversion

This project combines a couple of well trodden paths: PDF to text conversion, and then running an app in the background with audio playback. It introduced some new concepts to me and, based on a trawl of the usual resources for problem-solving, at least a couple of issues that are worth recording.

The TL;DR version is that PDF parsing gets into pretty complicated territory – unless you happen to know C well. There are open source libraries out there, but they didn’t hit the mark for me. I’ve implemented my own parser which is crude, but works. More or less!

Any PDF manipulation in iOS is going to depend on the Quartz 2D library somewhere along the line. Whether you call it directly or rely on another API that wraps it is a matter of choice. I looked at a couple. PDFKitten has a lot of functionality and seems to be by far the most sophisticated open source library but the documentation didn’t cover the simple requirement that I had – text extraction. There’s another one called pdfiphone that I struggled to get to work, and which epitomises the main challenge that I had with this project: I have only a rudimentary knowledge of C, which is what you’re getting into with Quartz.

So the basic structure of a PDF is a series of tags associated with different types of payload. You break the document down into pages, and process each page as a stream, calling C functions associated with the tags. This is a simple adaptation of example code straight from the Quartz documentation:

// Register the callback once, before any page is scanned, so that every page (including the first) is covered.
CGPDFOperatorTableRef myTable = CGPDFOperatorTableCreate();
CGPDFOperatorTableSetCallback(myTable, "TJ", getString);

for (NSInteger thisPageNum = 0; thisPageNum < numOfPages; thisPageNum++)
{
   // PDF pages are numbered from 1. The page belongs to the document ('Get' rule), so it isn't released here.
   CGPDFPageRef currentPage = CGPDFDocumentGetPage(*inputPDF, thisPageNum + 1);
   CGPDFContentStreamRef myContentStream = CGPDFContentStreamCreateWithPage(currentPage);
   CGPDFScannerRef myScanner = CGPDFScannerCreate(myContentStream, myTable, NULL);
   CGPDFScannerScan(myScanner);
   CGPDFScannerRelease(myScanner);
   CGPDFContentStreamRelease(myContentStream);
}
CGPDFOperatorTableRelease(myTable);

The CGPDFOperatorTableSetCallback line tells the scanner to call my own C function ‘getString’ when the stream encounters the tag “TJ”. Here’s the first part that was new to me: the blending of C and Objective C. My function, which is an adaptation of code I found here, looks like this:

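void getString(CGPDFScannerRef inScanner, void *userInfo)
{
   CGPDFArrayRef array;
   // TJ takes a single array operand; bail out if it isn't there.
   if (!CGPDFScannerPopArray(inScanner, &array))
   {
      return;
   }
   size_t count = CGPDFArrayGetCount(array);
   for (size_t n = 0; n < count; n++)
   {
      // Elements are either strings or numeric spacing adjustments; skip anything that isn't a string.
      CGPDFStringRef string;
      if (CGPDFArrayGetString(array, n, &string))
      {
         // CGPDFStringCopyTextString returns a +1 reference; __bridge_transfer hands it to ARC so it isn't leaked.
         NSString *data = (__bridge_transfer NSString *)CGPDFStringCopyTextString(string);
         if (data != nil)
         {
            [globalSelf appendMe:data];
         }
      }
   }
}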

So there are a couple of things going on here: this code is simply looking for a string – well, a Quartz style CGPDFStringRef – in the payload passed in by the stream processing. If it finds one, it converts it into an NSString via some ‘bridge casting’ – something I’ve come across before in working with the keychain, and which you need for ARC compliance. I then take that string and append it to a property via a method called appendMe.

It’s not possible to call ‘self’ from a C function. There are a number of possible ways around this, some of which get pretty nasty. The most elegant that I found was this:

static P2VConverter *globalSelf;

-(void)setMyselfAsGlobalVar
{
   globalSelf = self;
}

…which assigns an instance of the class that I created to do the PDF processing to a static variable called *globalSelf, and which I can then refer to as an alternate to self. To say this implementation isn’t particularly memory efficient is an understatement – but it works.
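One alternative to the static variable, sketched here in plain C rather than against Quartz itself: C callback APIs usually hand an opaque pointer back to the callback, and that is what the userInfo parameter of getString is for (it receives whatever was passed as the third argument to CGPDFScannerCreate, which is NULL in my loop). The toy scanner below is a stand-in for the real thing, and every name in it (TextCollector, collect_fragment, scan_fragments) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the converter object that collects the text. */
typedef struct {
    char text[256];
    size_t length;
} TextCollector;

/* Callback shape: the second argument is whatever pointer the caller supplied. */
typedef void (*StringCallback)(const char *fragment, void *userInfo);

/* Appends one fragment to the collector that arrives through userInfo. */
static void collect_fragment(const char *fragment, void *userInfo)
{
    TextCollector *collector = userInfo;   /* recover the context; no global needed */
    size_t n = strlen(fragment);

    assert(collector != NULL);
    /* Silently drop fragments that would overflow the fixed buffer. */
    if (collector->length + n < sizeof collector->text) {
        memcpy(collector->text + collector->length, fragment, n + 1);
        collector->length += n;
    }
}

/* Toy "scanner": calls the callback for each fragment, forwarding userInfo untouched. */
static void scan_fragments(const char *const *fragments, size_t count,
                           StringCallback callback, void *userInfo)
{
    for (size_t i = 0; i < count; i++) {
        callback(fragments[i], userInfo);
    }
}
```

Calling scan_fragments with collect_fragment and the address of a zeroed TextCollector leaves the concatenated text in the collector. In the Quartz version the equivalent would presumably be to pass (__bridge void *)self when creating the scanner and cast userInfo back inside getString, which also means two converters could run at once without sharing a global.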

There is a rich set of tags defined by a published PDF standard – all 800 pages of it – that tell whatever is rendering the document what to do with it. The best general explanation I found was this. There is a relatively small set of text related tags and TJ seems to be the simplest. It’s also the only one that I was able to adapt from other examples. I may come back to this again.

The way I tested this was to convert an HTML page into a PDF using Safari. The more complicated the text structure in your input document – say multiple columns per page, text boxes etc – the worse this simple extraction mechanism is going to cope.

On to email based importing of files. This isn’t something that I’d ever looked at before and it turned out to be a little more complicated than I expected. The amendments to the info.plist are pretty trivial, creating the association between the file type and the app. So in the PDF reader, when you launch the app in the contextual menu, what actually happens is that the file is copied to the app’s Documents folder, and a file:// style URL which points at it is passed to the AppDelegate – specifically, the application:handleOpenURL: method. I’d assumed, in the first instance, that I’d import the header for the viewcontroller into the AppDelegate – and this is a single view app, so ViewController.h – instantiate the VC, call a method I expose in the header and I’d be done:

ViewController *thisVC = [[ViewController alloc] init];
[thisVC importFromEmail:url];

This is wrong, and led to some peculiar side effects, which emerged when I started to try to set the point in the text to resume speech to reflect a change in the scrubbing control. This is what I came up with, which is a variant of this, adapted for a single view app:

UIStoryboard *storyboard = [UIStoryboard storyboardWithName:@"Main" bundle:nil];
UINavigationController *root = [[UINavigationController alloc]initWithRootViewController:[storyboard instantiateViewControllerWithIdentifier:@"EntryVC"]];
self.window.rootViewController= root;
ViewController *thisVC = (ViewController*)[[root viewControllers] objectAtIndex:0];
if (url != nil && [url isFileURL]) 
{
   [thisVC importFromEmail:url];
   NSLog(@"url: %@", url);
}

A couple of points of note. First, the method being called here is going to run before viewDidLoad or viewWillAppear. I normally do various inits in viewWillAppear, so I put them in a method that I call immediately in importFromEmail. Second, the string value for instantiateViewControllerWithIdentifier needs to be set in the storyboard.

Apart from the nasty callouts to C, what I spent most of my time working on was the scrubbing control functionality. I’ve created an IBAction method that will be called when the scrubber moves position. In the storyboard, I set the maximum value of the control to be 1, so to get the index of the new word position after dragging the control, I multiply the fraction that the control reports in the IBAction by the length of the original pdf string.

Having stopped – not paused; more on that in a second – the playback, I then start a new playback in the IBAction method for the play button, having created a new string based on a range: the word index from the scrubber control as the start point, and then the length by subtracting that from the original pdf string length. There was a little bit of twiddling necessary to support this so that it would work when called multiple times.
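The arithmetic is simple enough to sketch in plain C. This is an illustration rather than the code from the app: TextRange, remaining_range and snap_to_word_start are names I’ve made up for it, and snapping back to the start of a word is an assumption about what a tidier version might do.

```c
#include <assert.h>
#include <stddef.h>

/* Start and length of the text still to be spoken. */
typedef struct {
    size_t location;
    size_t length;
} TextRange;

/* Maps a scrubber value in [0, 1] to the range from that point to the end of the text. */
static TextRange remaining_range(double fraction, size_t textLength)
{
    TextRange range;

    /* Clamp (this also catches NaN) so a stray control value cannot index past the text. */
    if (!(fraction >= 0.0)) fraction = 0.0;
    if (fraction > 1.0) fraction = 1.0;

    range.location = (size_t)(fraction * (double)textLength);
    assert(range.location <= textLength);
    range.length = textLength - range.location;
    return range;
}

/* Moves an index back to the start of the word it landed in (words split on spaces only). */
static size_t snap_to_word_start(const char *text, size_t index)
{
    while (index > 0 && text[index - 1] != ' ') {
        index--;
    }
    return index;
}
```

With the scrubber at the halfway point of a 100 character string, remaining_range gives a start of 50 and a length of 50, which is the pair of values needed to cut the substring for the new playback.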

Part of the reason I took this approach was because the continueSpeaking method on the AVSpeechSynthesizer class didn’t seem to work. This was because I was using stopSpeakingAtBoundary instead of pauseSpeakingAtBoundary – something I’ve just noticed. Doh!

This has a knock-on effect, which is that the play button has to be stateful, with a continue if the pause was pressed, or a restart with the new substring if it’s because of the setting of the scrubber control. Given that the actual quality of the string conversion is pretty basic, the fix for this exceeds the usefulness of the app.

A couple of final comments. I discovered that it’s best to keep the functionality in the IBAction method called for the scrubbing control changes pretty simple: basically just setting the property for the new word position. I got some peculiar results when trying to do some string manipulation for a property, because the method was being called before a prior execution to set it was completed.

Lastly, I encountered an odd, and as yet unresolved bug when, as an afterthought, I added support for speech based on text input directly into a UITextField. I started seeing log errors of the type _BSMachError: (os/kern) invalid name (15). This appears to be quite a common one, as the number of upvotes on this question attests. I’m filing it in the same bucket as the stateful play button resolution: if the quality of the playback warranted it, I’d figure it out.


I was in two minds as to whether or not to write this app up given the mixed results, but I thought I might as well as it may be one of my last pieces of Objective C development.

I blogged a year ago, almost to the day, about my first Apple Watch app. I had one pretty serious watch project last summer, which I ended up canning around November. The idea was to sync accelerometer data from the watch with slow motion video, say of a golf swing. However, I ran into a problem with the precision of the data in AVAsset metadata, on which I have an as-yet unanswered question on Stack Overflow.

I also ran into the same usage issues with the watch that have been widely reported in the tech press. While I really liked the notifications [and the watch itself, which is a lovely piece of kit], the absence of any other compelling functionality barely warranted the bother of charging the thing. The borderline utility really came to the forefront for me when I was travelling, for both work and holidays, with no roaming network access. The watch stayed at home.

I also think it’s pretty telling that I bought precisely zero apps for it before I sold it a couple of months ago.

I’ll be looking to replace my current iPhone 6 Plus later this summer and I’m toying with the idea of moving to Android. I tried it before and hated it, but that was with a pretty low-end phone that I got as a stopgap after my 4S had an unfortunate encounter with the washing machine. Java would be much more useful than Objective C in my working life, and it’s a potentially interesting route into the language.

I’m sure it’s nothing personal…

Like probably the vast majority of people who have been running WordPress for more than a few months, my site is frequently being hit with automated attacks. I’ve only recently noticed this in my logs so I thought it would be interesting to have a closer look.

Around the turn of the year, for reasons I can’t recall, I happened to look at the raw access logs and noticed a lot of references to ‘xmlrpc.php’, which look like this:

142.4.4.190 - - [31/Jan/2016:18:13:42 +0000] "POST /blog/xmlrpc.php HTTP/1.0" 200 58043 "-" "-"

This is a real log file entry, and is a classic example of an XMLRPC bruteforce amplification attack: someone is POSTing at this page to try and bruteforce the admin password. The 58k at the end of the entry is the size of the response my server sent back, rather than of what was posted at it. I disabled the mechanism – and just verified that it’s working this morning [two months later :)], as the 200 server response is a bit more polite than I would have expected.

At the same time I installed [yet another] plugin, which rate limits failed admin password authentication attempts. It started triggering last week with repeated admin authentication failures from a machine in Hanoi. In my latest access log file [31st January to about half an hour ago], I have 1500 POST attempts which look like this:

123.30.140.199 - - [26/Feb/2016:13:37:47 +0000] "POST /blog/wp-login.php HTTP/1.0" 200 3766 "-" "-"

I’ve not paid much attention to log formats in a long time so I had to google what those final two hyphens are: a blank referer [note to my wife on the spelling :)] and user agent field respectively. The blank user agent is indicative of some sort of automated attack and, by virtue of the fact that the person who’s running it hasn’t even bothered to make it look like a real browser, one that isn’t particularly sophisticated.
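For what it’s worth, pulling those fields out of a line takes very little code. The sketch below assumes the entries are in the usual ‘combined’ layout shown in the extracts here (address, two placeholders, timestamp, request, status, byte count, referer, user agent); LogEntry, parse_log_line and has_blank_user_agent are names invented for the example, and it won’t cope with an empty quoted field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* The fields of interest from one access log line. */
typedef struct {
    char ip[46];
    int status;
    long bytes;            /* byte count as logged */
    char referer[256];
    char userAgent[256];
} LogEntry;

/* Parses one line; returns false if it doesn't match the expected layout. */
static bool parse_log_line(const char *line, LogEntry *entry)
{
    assert(line != NULL && entry != NULL);

    /* Skips the two placeholder fields, the [timestamp] and the "request". */
    int matched = sscanf(line,
                         "%45s %*s %*s [%*[^]]] \"%*[^\"]\" %d %ld \"%255[^\"]\" \"%255[^\"]\"",
                         entry->ip, &entry->status, &entry->bytes,
                         entry->referer, entry->userAgent);
    return matched == 5;
}

/* A lone hyphen is how the log records a missing user agent header. */
static bool has_blank_user_agent(const LogEntry *entry)
{
    return strcmp(entry->userAgent, "-") == 0;
}
```

Fed the wp-login.php line above, this should report a status of 200, a byte count of 3766 and a blank user agent.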

The logging pattern suggests what you’d expect: someone has harvested a set of servers that are running WordPress [how? by virtue of having the common pages that WordPress hosts. So a 200 in response to a GET for a ~/wp-login.php page, for instance], and is stepping through them.

This is another indicator of the lack of sophistication:

123.30.140.199 - - [26/Feb/2016:16:41:35 +0000] "POST /blog/wp-login.php HTTP/1.0" 200 1643 "-" "-"
123.30.140.199 - - [26/Feb/2016:16:41:37 +0000] "POST /blog/wp-login.php HTTP/1.0" 200 1643 "-" "-"
123.30.140.199 - - [26/Feb/2016:16:41:38 +0000] "POST /blog/wp-login.php HTTP/1.0" 200 1643 "-" "-"
123.30.140.199 - - [26/Feb/2016:16:41:44 +0000] "POST /blog/wp-login.php HTTP/1.0" 200 1643 "-" "-"
123.30.140.199 - - [26/Feb/2016:16:41:46 +0000] "POST /blog/wp-login.php HTTP/1.0" 200 1643 "-" "-"
123.30.140.199 - - [26/Feb/2016:16:41:58 +0000] "POST /blog/wp-login.php HTTP/1.0" 200 1643 "-" "-"
123.30.140.199 - - [26/Feb/2016:16:41:59 +0000] "POST /blog/wp-login.php HTTP/1.0" 200 1643 "-" "-"
123.30.140.199 - - [26/Feb/2016:16:42:05 +0000] "POST /blog/wp-login.php HTTP/1.0" 200 1643 "-" "-"
123.30.140.199 - - [26/Feb/2016:16:42:06 +0000] "POST /blog/wp-login.php HTTP/1.0" 200 1643 "-" "-"
123.30.140.199 - - [26/Feb/2016:16:42:13 +0000] "POST /blog/wp-login.php HTTP/1.0" 200 1643 "-" "-"
123.30.140.199 - - [26/Feb/2016:16:42:14 +0000] "POST /blog/wp-login.php HTTP/1.0" 403 9 "-" "-"
123.30.140.199 - - [26/Feb/2016:16:42:20 +0000] "POST /blog/wp-login.php HTTP/1.0" 403 9 "-" "-"
123.30.140.199 - - [26/Feb/2016:16:42:21 +0000] "POST /blog/wp-login.php HTTP/1.0" 403 9 "-" "-"
123.30.140.199 - - [26/Feb/2016:16:42:22 +0000] "POST /blog/wp-login.php HTTP/1.0" 403 9 "-" "-"
123.30.140.199 - - [26/Feb/2016:16:42:23 +0000] "POST /blog/wp-login.php HTTP/1.0" 403 9 "-" "-"

What’s happening here is that some software I’m running is blocking the user’s IP address after 10 authentication failures, shown by the 403, which is the server returning a ‘Forbidden’. What I’ve deleted from the log extract above is that there are a total of 25 Forbidden responses by the server in a row: the attack software isn’t checking the server response codes, which is a waste of resource on their part.
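I don’t know how the plugin implements this internally, but the behaviour in the log can be sketched as a per-address failure counter. Everything below is hypothetical apart from the threshold of 10: the names, the fixed-size table, and the absence of any expiry, which a real rate limiter would certainly have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_FAILURES 10   /* threshold taken from the log extract above */
#define MAX_TRACKED  64   /* arbitrary table size for the sketch */

typedef struct {
    char ip[46];
    int failures;
} FailureRecord;

typedef struct {
    FailureRecord records[MAX_TRACKED];
    size_t count;
} FailureTable;

/* Returns the HTTP status to send for one failed login attempt from ip. */
static int status_for_failed_login(FailureTable *table, const char *ip)
{
    FailureRecord *record = NULL;

    assert(table != NULL && ip != NULL);

    /* Linear search is fine for a table this small. */
    for (size_t i = 0; i < table->count; i++) {
        if (strcmp(table->records[i].ip, ip) == 0) {
            record = &table->records[i];
            break;
        }
    }
    if (record == NULL) {
        if (table->count == MAX_TRACKED) {
            return 200;   /* table full: this toy simply stops tracking new addresses */
        }
        record = &table->records[table->count++];
        snprintf(record->ip, sizeof record->ip, "%s", ip);
        record->failures = 0;
    }

    if (record->failures >= MAX_FAILURES) {
        return 403;       /* already over the limit: Forbidden */
    }
    record->failures++;
    return 200;           /* the login page is served again for a failed attempt */
}
```

Starting from a zeroed table, the first ten failed attempts from one address get a 200 and every one after that gets a 403, while other addresses are unaffected, which matches the run of ten 200s followed by 403s in the extract.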

I’ve had a bit of a trawl through my logs and am seeing similar, albeit less determined attacks like this, coming from all sorts of far flung places:

62.109.19.98 - - [13/Feb/2016:07:46:48 +0000] "POST /blog/xmlrpc.php HTTP/1.0" 200 58043 "-" "-"

That’s another XMLRPC bruteforce amplification attack, from Russia. A geolocation site reckons this one…

204.232.224.64 - - [12/Feb/2016:07:12:33 +0000] "POST /blog/xmlrpc.php HTTP/1.0" 200 58043 "-" "-"

…is in San Antonio, Texas. Interesting that the byte counts are identical: 58,043. Strictly, that field is the size of the response the server sent rather than of what was posted, but identical responses are still indicative of the same off the shelf attack software running with a pre-canned payload. Let’s do one more of these:

1.83.251.239 - - [11/Feb/2016:02:19:14 +0000] "POST /blog/xmlrpc.php HTTP/1.1" 200 45387 "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

I can honestly say that since I first started messing around on the internet in 1992, I’ve never seen an IP address that starts with 1. The geolocation service dutifully informs me that the machine that sent this parcel of good intention is located in Xi’an in China. At least they’ve spiced things up a bit with a user agent string and a different byte count.

So here’s a thing: I have a couple of blog posts on this site about a holiday we had in Vietnam. I blogged about a holiday to China that included a trip to Xi’an. I’ve also got a posting about a work trip to Russia. So… Russia and China are massive, populous countries. But Xi’an, in China? That looks like a pattern to me. I wonder if the bundle of joy – malware, whatever it is – that would be deposited on my site if it were to be compromised is tailored or localised in some way or other, based on the occurrences of those locations.

 

As per the title, and the obvious lack of finesse, I know that my server is just one on what’s probably a very long list of candidates that these automated attacks are hitting. WordPress has had something of a chequered history from a security point of view: it’s a natural target. While I’ve done the easy stuff to shore it up – like blocking a blank user agent – the options are relatively limited. That’s fine, given the fairly low-rent nature of the stuff being thrown at it, but I’d really prefer not to be distributing malware to people. Migrating off WordPress looks like it would be a pain so if the ancillary approaches start to look like they’re too much trouble I’ll just delete the site.