It's been a few years since I started to operate a service to return your IPv4 and IPv6 address. Although there are a bunch of other sites that offer this service as well, I've been amazed by the gradually increasing traffic to .
Here's a sample of the latest statistics:
Hits per day: 1.8 million (about 21 hits/second)
Unique IP addresses per day: 25,555
Hits per day from IPv6 addresses: 1,069 (a little sad)
Bandwidth used per day: ~ 400MB
The site is now running on multiple at behind a . In addition, the DNS records are hosted with Rackspace's service.
This should allow the site to reply more quickly and reliably. If you have suggestions for other improvements, let me know!
is a post from: Major Hayden's blog.
Thanks for following the blog via the RSS feed. Please don't copy my posts or quote portions of them without attribution.
My quest to get better at led me to create a new project on GitHub. It's called and it's ready for you to use.
Why do we need a JSON API for MySQL?
The real need sprang from a situation I was facing daily at . We have a lot of production and pre-production environments which are in flux but we need a way to query data from various MySQL servers for multiple purposes. Some folks need data in ruby or python scripts while others need to drag in data with .NET and Java. Wrestling with the various adapters and all of the user privileges on disparate database servers behind different firewalls on different networks was less than enjoyable.
That's where this bridge comes in.
The bridge essentially gives anyone the ability to talk to multiple database servers across different environments by talking to a single endpoint with easily configurable security and encryption. As long as the remote user can make an HTTP POST and parse some JSON, they can query data from multiple MySQL endpoints.
How does it work?
It all starts with a simple HTTP POST. I've become a big fan of the Python module. If you're using it, this is all you need to submit a query:
import requests
payload = {'sql': 'SELECT * FROM some_tables WHERE some_column=some_value'}
url = "http://localhost:5000/my_environment/my_database"
r = requests.post(url, data=payload)
print(r.text)
The bridge takes your query and feeds it into the corresponding MySQL server. When the results come back, they're converted to JSON and returned via the same HTTP connection.
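The conversion step is where MySQL-specific types such as dates and decimals need some care, because Python's json module won't serialize them on its own. Here is a minimal sketch of that step. It is not the bridge's actual code: the function names rows_to_json and _json_default are made up for illustration, and the sample row is invented.

```python
import datetime
import decimal
import json


def _json_default(value):
    """Convert values that json.dumps cannot handle natively (hypothetical helper)."""
    if isinstance(value, (datetime.datetime, datetime.date, datetime.time)):
        return value.isoformat()
    if isinstance(value, decimal.Decimal):
        # Strings keep the exact decimal digits; floats could lose precision.
        return str(value)
    if isinstance(value, bytes):
        return value.decode("utf-8", errors="replace")
    raise TypeError("Cannot serialize %r" % (value,))


def rows_to_json(columns, rows):
    """Turn column names plus row tuples into a JSON array of objects."""
    records = [dict(zip(columns, row)) for row in rows]
    return json.dumps(records, default=_json_default)


# Invented sample data standing in for a real result set.
example = rows_to_json(
    ["id", "price", "created"],
    [(1, decimal.Decimal("9.99"), datetime.date(2012, 2, 14))],
)
print(example)
```

Returning one object per row keeps the client side simple: whoever made the HTTP POST can feed the response straight into their language's JSON parser and index rows by column name.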
What technology does it use?
does the heavy lifting for the HTTP requests and wraps the module in something a little more user friendly. Other than those modules, and are the only other modules not provided by the standard Python libraries.
Is it fast?
Yes. I haven't done any detailed benchmarks on it yet, but the overhead is quite low even with a lot of concurrency. The biggest slowdowns come from network latency between you and the bridge or between the bridge and the database server. Keep in mind that gigantic result sets will take a longer time to transfer across the network and get transformed into JSON.
I found a bug. I have an idea for an improvement. You're terrible at Python.
All feedback (and every pull request) is welcome. I'm still getting the hang of Python (hey, I've only been writing in it seriously for a few weeks!) and I'm always eager to learn a new or better way to accomplish something. Feel free to create an issue in GitHub or submit a pull request with a patch.
Today marks the fifth year that this blog has existed on the internet. I bought the domain on February 14th, 2007 and tossed together a quick WordPress installation (I can't even remember the version now!) to hold my notes that I was gathering at work.
At the time, I had recently parted ways with a very small internet startup and joined the ranks at as an entry-level Linux system administrator. The abrupt change from "top dog at the startup" to "wow, I don't know anything about Linux" caught me by surprise and I was trying to stuff as much knowledge into my brain as quickly as I could. My teammates at Rackspace were eager to show me the ropes of wrangling servers and supporting customers.
As I mentioned already, the blog started out just as a place to stuff my notes from the things I learned at work. I figured that it would be nice to store it in a searchable format but it would also be great if I could link other people to certain posts if they needed more information to fix a problem. It was a way to retain knowledge but yet give it back to the people around me who needed it.
The blog has hit 456 posts (this one is #457) and it's gone from a few page views per day to just over 20,000 per day. Here are the top five most accessed posts (since I've been keeping stats):
I'd like to send out a big thanks to the people who read this blog, add comments (or complaints!), and suggest new topics. You are the reason why I take the time to keep this blog going.
One of the handiest tools in the OpenSSL toolbox is s_client. You can quickly view lots of details about the SSL certificates installed on a particular server and diagnose problems. For example, use this command to look at Google's SSL certificates:
You'll see the chain of certificates back to the original certificate authority where Google bought its certificate at the top, a copy of their SSL certificate in plain text in the middle, and a bunch of session-related information at the bottom.
This works really well when a site has one SSL certificate installed per IP address (this used to be a hard requirement). With Server Name Indication (SNI), a web server can have multiple SSL certificates installed on the same IP address. SNI-capable browsers will specify the hostname of the server they're trying to reach during the initial handshake process. This allows the web server to determine the correct SSL certificate to use for the connection.
If you try to connect to rackerhacker.com with s_client, you'll find that you receive the default SSL certificate installed on my server and not the one for this site:
$ openssl s_client -connect rackerhacker.com:443
Certificate chain
0 s:/C=US/ST=Texas/L=San Antonio/O=MHTX Enterprises/CN=*.mhtx.net
i:/C=US/O=SecureTrust Corporation/CN=SecureTrust CA
1 s:/C=US/O=SecureTrust Corporation/CN=SecureTrust CA
i:/C=US/O=Entrust.net/OU=www.entrust.net/CPS incorp. by ref. (limits liab.)/OU=(c) 1999 Entrust.net Limited/CN=Entrust.net Secure Server Certification Authority
Add on the -servername argument and s_client will do the additional SNI negotiation step for you:
$ openssl s_client -connect rackerhacker.com:443 -servername rackerhacker.com
Certificate chain
0 s:/OU=Domain Control Validated/OU=PositiveSSL/CN=rackerhacker.com
i:/C=GB/ST=Greater Manchester/L=Salford/O=Comodo CA Limited/CN=PositiveSSL CA
1 s:/C=GB/ST=Greater Manchester/L=Salford/O=Comodo CA Limited/CN=PositiveSSL CA
i:/C=US/ST=UT/L=Salt Lake City/O=The USERTRUST Network/OU=http://www.usertrust.com/CN=UTN-USERFirst-Hardware
2 s:/C=US/ST=UT/L=Salt Lake City/O=The USERTRUST Network/OU=http://www.usertrust.com/CN=UTN-USERFirst-Hardware
i:/C=SE/O=AddTrust AB/OU=AddTrust External TTP Network/CN=AddTrust External CA Root
3 s:/C=SE/O=AddTrust AB/OU=AddTrust External TTP Network/CN=AddTrust External CA Root
i:/C=SE/O=AddTrust AB/OU=AddTrust External TTP Network/CN=AddTrust External CA Root
You may be asking yourself this question:
Why doesn't the web server just use the Host: header that my browser sends already to figure out which SSL certificate to use?
Keep in mind that the SSL negotiation must occur prior to sending the HTTP request through to the remote server. That means that the browser and the server have to do the certificate exchange earlier in the process and the browser wouldn't get the opportunity to specify which site it's trying to reach. SNI fixes that by allowing a Host: header type of exchange during the SSL negotiation process.
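The same with-and-without-SNI comparison can be sketched in Python using the standard ssl module, where the server_hostname argument to wrap_socket is what puts the name into the handshake. This is only an illustration under a few assumptions: the helper names fetch_certificate and common_name are made up, hostname checking is turned off on purpose so that a mismatched default certificate can be inspected, and some servers will simply refuse a handshake that carries no SNI at all.

```python
import socket
import ssl


def common_name(cert):
    """Extract the subject commonName from a getpeercert() dictionary."""
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                return value
    return None


def fetch_certificate(host, port=443, server_name=None, timeout=10.0):
    """Connect to host:port and return the certificate the server presents.

    When server_name is None, no SNI extension is sent, so a server with
    several certificates falls back to its default one.
    """
    context = ssl.create_default_context()
    # Hostname checking is disabled so a mismatched default certificate can
    # be inspected; the certificate chain itself is still verified.
    context.check_hostname = False
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=server_name) as tls:
            return tls.getpeercert()


# Example usage (needs network access, so it is left commented out):
#   host = "rackerhacker.com"
#   print(common_name(fetch_certificate(host)))                    # no SNI
#   print(common_name(fetch_certificate(host, server_name=host)))  # with SNI
```

If the two calls print different names, the server is picking its certificate based on the SNI value, which is the behavior the -servername flag exposes in s_client.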
I enjoy living on the edge occasionally, and that sometimes means I keep up with OpenStack changes commit by commit. If you're in the same boat, you may save some time by using my repository of bleeding-edge Python packages from the OpenStack projects:
Python packages are updated moments after the commit is merged into the repositories under .
Although the packages will contain the latest code available, rest assured that the code has passed an initial code review (by humans), unit tests, and varying levels of functional or integrated testing. There may still be a bug or two cropping up after that, so be aware of that as you utilize these packages.
I was surprised to see coverage about on last Sunday and I was curious to know what effect the story would have on my site's overall traffic. wrote a great summary of what the site offers and how people can use it in their daily work. It's pretty obvious that icanhazip.com really only serves a niche group of internet users, but even I was surprised at the level of interest.
icanhazip.com traffic data — March 2011
The graph on the right shows some recent traffic data from March 2011. The Lifehacker story was published around 7AM on March 27th in Australia, so I first started seeing a spike on the 26th (my server's time zone is UTC-5). The yellow bar is a count of the unique visits while the other bars count page views, hits and total bandwidth.
The count of unique visitors certainly increased (by about 10-11x), but the overall hits didn't increase by much. I'd imagine that most visitors accessed the site, noticed that it displayed their public IP, and then they went on their way. As I've said before, this site is easy to re-create and will really only serve a niche segment of internet users.
On most days, I'll receive a very high number of hits from a relatively small number of unique IP addresses. There are quite a few people who check their public-facing IP address every second, but it seems like the majority stick to a more reasonable interval of 5-30 minutes. I've yet to find the value in checking my public IP address once per second, but there are obviously some folks out there who find it valuable (or they aren't good at implementing sleeps in their scripts).
Here's a bit of trivia about the site for those who are interested:
Almost 40% of the traffic to the site is from Eastern European and Asian countries
The average user on the site generates about 45 hits per day
Linux users make up 91% of the traffic on the site (based on user agent strings)
Over 88% of the hits to the site are requests made with curl or wget
Most traffic is received between 4-5PM CDT
Almost 98% of the visitors who reach the site do so via a direct link without a referrer
If you offer a web service that users query via scripts or other applications, you'll probably find that some people will begin to abuse the service. My site is no exception.
While many of the users have reasonable usage patterns, there are some users that query the site more than once per second from the same IP address. If you haven't used the site before, all it does is return your public IP address in plain text. Unless your IP changes rapidly, you may not need to query the site more than a few times an hour.
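For anyone scripting against the site, a polite polling loop is short. The sketch below is an assumption-laden example rather than a recommended client: the function name watch_public_ip is made up, the ten-minute default interval is just one reasonable choice, and the fetch and sleep functions are passed in so the loop can be demonstrated here without touching the network.

```python
import time


def watch_public_ip(fetch_ip, interval_seconds=600, checks=3, sleep=time.sleep):
    """Call fetch_ip() `checks` times, pausing between calls.

    Returns the addresses observed, recording one only when it differs
    from the previous reading.
    """
    changes = []
    for attempt in range(checks):
        address = fetch_ip().strip()
        if not changes or changes[-1] != address:
            changes.append(address)
        if attempt < checks - 1:
            sleep(interval_seconds)  # The part some scripts forget.
    return changes


# A real fetcher could look like this (commented out to avoid network use):
#   from urllib.request import urlopen
#   fetch = lambda: urlopen("http://icanhazip.com", timeout=10).read().decode()

# Demonstration with canned readings and a fake sleep that records pauses.
readings = iter(["203.0.113.7\n", "203.0.113.7\n", "203.0.113.9\n"])
pauses = []
changes = watch_public_ip(lambda: next(readings), 600, 3, sleep=pauses.append)
print(changes, pauses)
```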
I added the following to my icanhazip.com virtual host definition to get the message across to those users that abuse the service:
ErrorDocument 403 "No can haz IP. Stop abusing this service. \
Contact major at mhtx dot net for details."
RewriteEngine On
# Escape the dots so they match literal periods rather than any character
RewriteCond %{REMOTE_ADDR} ^12\.23\.34\.45$ [OR]
RewriteCond %{REMOTE_ADDR} ^98\.87\.76\.65$
RewriteRule .* nocanhaz [F]
The users that are caught on the business end of these 403 responses will see something like this:
$ curl -i icanhazip.com
HTTP/1.1 403 Forbidden
Date: Wed, 17 Nov 2010 13:42:55 GMT
Server: Apache
Content-Length: 84
Connection: close
Content-Type: text/html; charset=iso-8859-1
No can haz IP. Stop abusing this service. Contact major at mhtx dot net for details.
As my uptime reports have shown, and as some of you have reported, my blog's load time has increased steadily over the past few weeks. It turns out that one of my VMs was on a physical machine that had some trouble and I was reaching a point where GlusterFS's replicate functionality couldn't meet my performance needs.
Instead of using as I had before in my , I decided to use in dual-primary mode with as the clustering filesystem on top of it. The performance is quite good so far:
Pingdom Response Time Graph for rackerhacker.com
I switched over the DNS late last night and the response time has fallen from the two to three second range (during times of low load) to right around one second per request. In addition to the reduced load times, I can support higher concurrency without significant performance degradation.
Don't worry — I'll make a detailed post on this topic later along with a guide on how to set it up yourself.
Today, on my 28th birthday, I'm finally delivering on a promise to my readers which I made about two months ago. I've on how to host a web application redundantly in a cloud environment. While it's still a bit of a rough draft, it should be a good starting point for those who haven't worked in virtualized environments before. Also, it may show some of the more experienced systems administrators a new way to do things.
The guide:
As always, if you find anything in the guide that needs improvement, I'm all ears.
As many of you might have noticed from my and my , I've been working with GlusterFS in production for my personal hosting needs for just over a month. I've also been learning quite a bit from some of the folks in the channel on . On a few occasions I've even been able to help out with some configuration problems from other users.
There has been quite a bit of interest in GlusterFS as of late and I've been inundated with questions from coworkers, other system administrators and developers. Most folks want to know about its reliability and performance in demanding production environments. I'll try to do my best to cover the big points in this post.
First off, here's how I'm using it in production: I have two web nodes that keep content in sync for various web sites. They each run a GlusterFS server instance and they also mount their GlusterFS share. I'm using the to keep both web nodes in sync with client side replication.
Here are my impressions after a month:
I/O speed is often tied heavily to network throughput
This one may seem obvious, but it's not true in every environment. If you deal with a lot of small files like I do, a 40 Mbit/sec link between the Xen guests is plenty. Adding extra throughput didn't add any performance to my servers. However, if you wrangle large files on your servers regularly, you may want to consider higher throughput links between your servers. I was able to push just under 900 Mbit/sec by using dd to create a large file within a GlusterFS mount.
Network and I/O latency are big factors for small file performance
If you have a busy network and the latency creeps up from time to time, you'll find that your small file performance will drop significantly (especially with the replicate translator). Without getting too nerdy (you're welcome to read the ), replication is an intensive process. When a file is accessed, the client goes around to each server node to ensure that it not only has a copy of the file being read, but that it has the correct copy. If a server didn't save a copy of a file (due to disk failure or the server being offline when the file was written), it has to be synced across the network from one of the good nodes.
When you write files on replicated servers, the client has to roll through the same process first. Once that's done, it has to lock the file, write to the change log, then do the write operation, drop the change log entries, and then unlock the file. All of those operations must be done on all of the servers. High latency networks will wreak havoc on this process and cause it to take longer than it should.
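A bit of back-of-the-envelope arithmetic shows why latency matters so much for small files. The model below is a deliberate oversimplification and not a description of GlusterFS internals: it assumes each of the five steps above (lock, change log write, data write, change log cleanup, unlock) costs one network round trip, that all servers are contacted in parallel, and that disk time is zero. The function name and the sample latencies are made up.

```python
def replicated_write_seconds(round_trip_ms, file_count, steps_per_write=5):
    """Estimate total time spent waiting on the network for replicated writes.

    Toy model: every step of every write costs exactly one round trip.
    """
    return file_count * steps_per_write * round_trip_ms / 1000.0


# Writing 10,000 small files on a quiet LAN versus a congested network.
quiet = replicated_write_seconds(round_trip_ms=0.2, file_count=10000)
busy = replicated_write_seconds(round_trip_ms=5.0, file_count=10000)
print("quiet network: about %.0f seconds" % quiet)
print("busy network:  about %.0f seconds" % busy)
```

Under those assumptions the same batch of files goes from roughly ten seconds of network waiting to a little over four minutes, without a single extra byte being transferred.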
Even with a fast, low-latency network between your servers, slow disks can still be a problem. If the client is waiting on the server nodes' disks to write data, both read and write performance will suffer. I've tested this in environments with fast networks and very busy RAID arrays. Even when the network was heavily underutilized, slow disks could cut performance drastically.
Monitoring GlusterFS isn't easy
When the client has communication problems with the server nodes, some weird things can happen. I've seen situations where the client loses connections to the servers (see the next section on reliability) and the client mount simply hangs. In other situations, the client has been knocked offline entirely and the process is missing from the process tree by the time I logged in. Your monitoring will need to ensure that the mount is active and is responding in a timely fashion.
There's a which allows you to monitor GlusterFS mounts via nagios that Ian Rogers put together. Also, you can get some historical data with .
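The two conditions mentioned above (the mount exists, and it answers in a timely fashion) can be checked with a few lines of Python. This is a rough sketch and not the nagios plugin itself: the function name check_mount, the status strings, and the five-second default are all assumptions. The probe runs in a daemon thread because a hung FUSE mount can block a stat call indefinitely, and the check needs to give up rather than hang along with it.

```python
import os
import threading


def check_mount(path, timeout_seconds=5.0):
    """Return 'ok', 'not mounted', or 'unresponsive' for a mount point."""
    result = []

    def probe():
        try:
            os.stat(path)  # Blocks here if the mount is hung.
            result.append("ok" if os.path.ismount(path) else "not mounted")
        except OSError:
            # Missing path or a dead transport endpoint.
            result.append("not mounted")

    # A daemon thread cannot keep the interpreter alive if it never returns.
    worker = threading.Thread(target=probe, daemon=True)
    worker.start()
    worker.join(timeout_seconds)
    return result[0] if result else "unresponsive"


# "/" is always a mount point on a POSIX system, so it makes a safe demo.
print(check_mount("/"))
```

In a monitoring script you would point this at the GlusterFS mount path and map the three statuses onto whatever exit codes your monitoring system expects.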
GlusterFS 3.x is pretty reliable
When I first started working with GlusterFS, I was using a version from the 2.x tree. The Fedora package maintainer hadn't updated the package in quite some time, but I figured it should work well enough for my needs. I found that the small file performance was lacking and the nodes often had communication issues when many files were being accessed or written simultaneously. This improved when I built my own RPMs of 3.0.4 (and later 3.0.5) and began using those instead.
I did some failure testing by hard cycling the server and client nodes and found some interesting results. First off, abruptly pulling clients had no effect on the other clients or the server nodes. The connection eventually timed out and the servers logged the timeout as expected.
Abruptly pulling servers led to some mixed results. In the 2.x branch, I saw client hangs and timeouts when I abruptly removed a server. This appears to be mostly corrected in the 3.x branch. If you're using replicate, it's important to keep in mind that the first server volume listed in your client's volume file is the one that will be coordinating the file and directory locking. Should that one fall offline quickly, you'll see a hiccup in performance for a brief moment and the next server will be used for coordinating the locking. When your original server comes back up, the locking coordination will shift back.
Conclusion
I'm really impressed with how much GlusterFS can do with the simplicity of how it operates. Sure, you can get better performance and more features (sometimes) from something like Lustre or GFS2, but the amount of work required to stand up that kind of cluster isn't trivial. GlusterFS really only requires that your kernel have FUSE support (it's been in mainline kernels since 2.6.14).
There are some things that GlusterFS really needs in order to succeed:
Documentation — The current documentation is often out of date and confusing. I've even found instances where the documentation contradicts itself. While there are some good technical documents about the design of some translators, they really ought to do some more work there.
Statistics gathering — It's very difficult to find out what GlusterFS is doing and where it can be optimized. Profiling your environment to find your bottlenecks is nearly impossible with the 2.x and 3.x branches. It doesn't make it easier when some of the performance translators actually decrease performance.
Community involvement — This ties back into the documentation part a little, but it would be nice to see more participation from Gluster employees on IRC and via the mailing lists. They're a little better with mailing list responses than other companies I've seen, but there is still room for improvement.
If you're considering GlusterFS for your servers but you still have more questions, feel free to leave a comment or find me on Freenode (I'm 'rackerhacker').