
How to Migrate 3 MySQL Databases to Amazon RDS in 5 Minutes

I had a LAMT (Linux-Apache-MySQL-Tomcat) stack on Amazon EC2 and I wanted to move all remaining MySQL databases (3) to an existing Amazon RDS instance. This would allow me to shut down the MySQL instance on EC2, freeing RAM for Tomcat and leveraging RDS automated backups for those 3 databases in case of a disaster. The databases to migrate only contain low-volume TEST data, but I already have that RDS instance, so why not use it?

I also have an SLA with my clients that allows me to perform “Standard daily maintenance”. Basically, this maintenance must be performed outside business hours and must last at most 15 minutes.

The key to the success of this migration was to prepare, prepare and prepare before the actual migration. So here is what I did before the migration:

  • Create 3 empty databases and users on the RDS instance
  • Prepare new configuration files with the new JDBC url pointing to the RDS instance
  • Prepare all the commands that must be executed
  • Review and test the commands as needed

Now that I am prepared I just need to wait for the “Standard daily maintenance” time. Then, I just copy and paste the commands in a terminal. I prefer to copy/paste the commands one by one, so if any command fails (for any reason), it can be fixed right away before running the next commands. Here is a summary of the commands:

  1. Shut down Tomcat
  2. MySQL dump/restore from EC2 to RDS (3 databases)
  3. Copy the configuration files with JDBC url pointing to RDS (3 files)
  4. Prevent MySQL from starting during boot: echo manual | sudo tee /etc/init/mysql.override
  5. sudo reboot (I want to verify that MySQL on EC2 won’t start after a reboot)

Everything went fine. After reboot, the MySQL instance on EC2 was not started, as expected. The Tomcat webapps were fine as well and there is more free RAM for Tomcat.

For those webapps, I have uptime monitoring at one-minute intervals with Pingdom. I receive an email when a webapp goes down and another when it comes back up. Here is the “UP” email from Pingdom, showing 5 minutes of downtime for that migration:

OpCode is UP again at 12/22/2014 08:25:36PM, after 5m of downtime.

 

ImageIO.read() causes OutOfMemoryError

The problem

A few days ago, a reader contacted me with a problem he had in a piece of code. The code was creating thumbnails from images and was throwing an OutOfMemoryError after a few dozen images. Here is the simplified code, line 26 throws OutOfMemoryError:

And here is the stacktrace with the OutOfMemoryError:

The cause of the OutOfMemoryError

The cause of that problem is not obvious, and the stacktrace (like the title of this article) is somewhat misleading. Here is what happens: the call to fullSizeImage.getScaledInstance() (lines 33-34) produces a smaller thumbnail image, but that thumbnail object keeps a reference to the fullSizeImage. Since JPEG files are highly compressed, the decoded full-size image takes far more memory than the file on disk, and that memory cannot be reclaimed as long as the thumbnail is still referenced.

The solution: Do not use Image.getScaledInstance()

The solution was to replace the call to fullSizeImage.getScaledInstance() with the lines 34-38 highlighted below. That solution allowed the code to read thousands of images, because the fullSizeImage was no longer kept in memory.
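A minimal sketch of that kind of replacement, assuming the thumbnail is produced by drawing the full-size image into a new BufferedImage with Graphics2D. The class name ThumbnailSketch, the scale() helper and the image sizes are hypothetical and only there for illustration; this is not the reader's code.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

class ThumbnailSketch {

    // Draws fullSizeImage into a new, independent BufferedImage of the requested size.
    // The returned thumbnail holds its own pixels and no reference to fullSizeImage,
    // so the full-size image can be garbage collected once the caller drops it.
    static BufferedImage scale(BufferedImage fullSizeImage, int width, int height) {
        BufferedImage thumbnail = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = thumbnail.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(fullSizeImage, 0, 0, width, height, null);
        } finally {
            g.dispose(); // release the graphics context right away
        }
        return thumbnail;
    }

    public static void main(String[] args) {
        // Stand-in for an image that would normally come from ImageIO.read(file).
        BufferedImage fullSizeImage = new BufferedImage(1600, 1200, BufferedImage.TYPE_INT_RGB);
        BufferedImage thumbnail = scale(fullSizeImage, 160, 120);
        System.out.println(thumbnail.getWidth() + "x" + thumbnail.getHeight());
    }
}
```

In a loop over many files, only the small thumbnail needs to stay reachable after each iteration.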

References

From Oracle’s website: How do I create a resized copy of an image?

For this particular problem, I did not need to produce a heap dump, because the code was small enough. With a few tests and a few searches on Google, I could figure out what was happening. However, if you have no idea where the OutOfMemoryError comes from, you may want to read this article: How to fix java.lang.OutOfMemoryError: Java heap space

HTTPS Wildcard Subdomain: DNS + Apache + Tomcat Config

I recently had to configure HTTPS on a wildcard subdomain with Apache HTTP server as a reverse proxy to a Tomcat backend. I had a few more requirements:

  1. Redirect all http traffic to https and preserve the subdomain (hostname). For instance:
    • http://sub1.example.com/ -> redirect to -> https://sub1.example.com/
    • http://sub2.example.com/ -> redirect to -> https://sub2.example.com/
    • etc.
  2. I want to have a PHP wiki on the subdirectory /wiki and I want to send the rest of the traffic to Tomcat.
  3. Tomcat needs to know the subdomain (hostname) and will serve content accordingly.
  4. I don’t know the subdomains in advance because they are chosen by users, just like *.wordpress.com

A few parts were not trivial, so I will share my setup.

DNS Wildcard Subdomain Configuration

DNS is probably the easiest part. Nowadays, most domain registrars offer good DNS support for free with your domain. If that is not the case for your registrar, you may want to consider Namecheap. Their DNS also supports wildcard entries. Otherwise, you can use the popular BIND DNS server. Here is how to configure a wildcard entry in BIND. Replace 55.55.55.55 with your IP address.

Apache Wildcard Subdomain Configuration

This was trickier. The Apache HTTP server configuration has 2 main parts.

1. HTTP (port 80): We use mod_rewrite to redirect all traffic for *.example.com to https (port 443) and we preserve the hostname with the %{HTTP_HOST} variable.

2. HTTPS (port 443): Except for the PHP subdirectory (/wiki), we reverse proxy all traffic to Tomcat, which listens on port 9090 of localhost.

Tomcat Wildcard Subdomain Configuration

Below you’ll find the configuration of Tomcat (in server.xml), which is quite standard. Actually, we do not need to define any “wildcard”; we just define a defaultHost in the Engine element. Then we deploy a ROOT.war in the webapps directory (/opt/example.com/tomcat7/webapps) to serve all content at the root context path.

How does Tomcat Know the Subdomain?

With this configuration, all requests are forwarded to Tomcat with “localhost” as the hostname. Fortunately, the Apache reverse proxy sends extra request headers to Tomcat, namely:

X-Forwarded-For: The IP address of the client.

X-Forwarded-Host: The original host requested by the client in the Host HTTP request header.

X-Forwarded-Server: The hostname of the proxy server.

So in Tomcat (or any other servlet container), just use the Java code below to get the value of that header “X-Forwarded-Host” and you’ll know the subdomain.
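A minimal sketch of that lookup, written without the servlet API so it stays self-contained. The class ForwardedHostSketch and its two helpers are hypothetical names; the header lookup is passed in as a function, and in a servlet you would pass request::getHeader. It assumes that, when several proxies are chained, the header holds a comma-separated list whose first entry is the host the client asked for.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

class ForwardedHostSketch {

    // Returns the host originally requested by the client.
    // headerLookup maps a header name to its value (or null when absent).
    static String originalHost(Function<String, String> headerLookup) {
        String forwarded = headerLookup.apply("X-Forwarded-Host");
        if (forwarded == null || forwarded.trim().isEmpty()) {
            // No proxy header: fall back to the regular Host header.
            return headerLookup.apply("Host");
        }
        // Keep only the first entry of a possible comma-separated list.
        return forwarded.split(",")[0].trim();
    }

    // Returns "sub1" for host "sub1.example.com" and baseDomain "example.com",
    // or null when the host is not a subdomain of baseDomain.
    static String subdomain(String host, String baseDomain) {
        if (host == null) {
            return null;
        }
        String h = host.toLowerCase(Locale.ROOT);
        int colon = h.indexOf(':');
        if (colon >= 0) {
            h = h.substring(0, colon); // drop an optional ":port" suffix
        }
        String suffix = "." + baseDomain.toLowerCase(Locale.ROOT);
        if (!h.endsWith(suffix)) {
            return null;
        }
        return h.substring(0, h.length() - suffix.length());
    }

    public static void main(String[] args) {
        // Simulated request headers as Tomcat would see them behind the proxy.
        Map<String, String> headers = new HashMap<>();
        headers.put("Host", "localhost:9090");
        headers.put("X-Forwarded-Host", "sub1.example.com");

        String host = originalHost(headers::get);
        System.out.println(host + " -> " + subdomain(host, "example.com"));
    }
}
```

Since the header comes from the request, it is only trustworthy when Tomcat is reachable exclusively through the proxy.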

 

Alternatively, you may also use the ProxyPreserveHost On directive in Apache configuration and you should be able to get the hostname (subdomain) normally in Tomcat. NOTE: I haven’t tested that setup.

Bonus: Wildcard SSL Certificate

Of course you’ll need a wildcard SSL certificate. Those are usually very expensive, but Namecheap resells Comodo wildcard certificates at a very good price. No, I do not have any interest in Namecheap, they just happen to be very good at what they do 🙂

Ubuntu Apache Reverse Proxy Rewrite HTML Links

I just wasted a few hours on this, so I will share a few tips. If you want to set up a reverse proxy and rewrite links in HTML pages, you can use the Apache module mod_proxy_html.

Step 1. Install and enable Apache mod_proxy

 

Step 2. Apache configuration

In Ubuntu 14.04 LTS, it does not work “out of the box”, because some standard config is missing when enabling mod_proxy_html. More specifically, the ProxyHTMLLinks directives are missing in Ubuntu 14.04. I say “missing”, because those directives are included by default in earlier releases and in other distros (in a file called proxy_html.conf). Also, pay particular attention to the directives ProxyHTMLEnable, ProxyHTMLExtended and SetOutputFilter.

So, let’s say you want your Apache server at http://host1.example.com/path1 to serve (proxy) the content of the server at http://host2.example.com/path2 and rewrite HTML links. Here is the config that works for me on Ubuntu 14.04 LTS.

 

Programmatically Configure Hibernate (JPA) with DBCP

I recently had deadlock issues with c3p0 and statement caching. Long story short, after investigating c3p0 code, I decided to switch to DBCP (maybe I’ll write a post with the long story).

I am not a big fan of Spring (here again, maybe I’ll write a post about that). If you are like me, here is how to programmatically configure Hibernate (JPA) to use DBCP, without Spring and without JNDI.

With DBCP, all my deadlock issues disappeared. Thank you ASF.

How to fix java.lang.OutOfMemoryError: Java heap space

If you get an OutOfMemoryError with the message “Java heap space” (not to be confused with the message “PermGen space”), it simply means the JVM ran out of heap memory. When it occurs, you basically have 2 options:

Solution 1. Allow the JVM to use more memory

With the -Xmx JVM argument, you can set the maximum heap size. For instance, passing -Xmx2048m on the java command line allows the JVM to use up to 2 GB (2048 MB) of heap.

Solution 2. Improve or fix the application to reduce memory usage

In many cases, like in the case of a memory leak, that second option is the only good solution. A memory leak happens when the application creates more and more objects and never releases them. The garbage collector cannot collect those objects and the application will eventually run out of memory. At this point, the JVM will throw an OOM (OutOfMemoryError).

A memory leak can stay hidden for a long time. For instance, the application might behave flawlessly during development and QA, then suddenly throw an OOM after several days in production at a customer site. To solve that issue, you first need to find its root cause, which can be very hard to do in development if the problem cannot be reproduced. Follow these steps to find the root cause of the OOM:

Step 1. Generate a heap dump on OutOfMemoryError

Start the application with the VM argument -XX:+HeapDumpOnOutOfMemoryError. This tells the JVM to produce a heap dump when an OOM occurs.

Step 2. Reproduce the problem

Well, if you cannot reproduce the problem in dev, you may have to use the production environment. When you reproduce the problem and the application throws an OOM, it will generate a heap dump file.

Step 3. Investigate the issue using the heap dump file

Use VisualVM to read the heap dump file and diagnose the issue. VisualVM is a program located in JDK_HOME/bin/jvisualvm. The heap dump file has all information about the memory usage of the application. It allows you to navigate the heap and see which objects use the most memory and what references prevent the garbage collector from reclaiming the memory. Here is a screenshot of VisualVM with a heap dump loaded:

Heap Dump in VisualVM

This will give you very strong hints and you will (hopefully) be able to find the root cause of the problem. The problem could be a cache that grows indefinitely, a list that keeps collecting business-specific data in memory, a huge request that tries to load almost all data from database in memory, etc.

Once you know the root cause of the problem, you can work out a solution to fix it. In the case of a cache that grows indefinitely, a good solution could be to set a reasonable limit on that cache. In the case of a query that tries to load almost all data from the database into memory, you may have to change the way you manipulate data; you could even have to change the behavior of some features of the application.
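To illustrate the cache case, here is a minimal sketch of a size-limited cache built on LinkedHashMap and its removeEldestEntry() hook. The class name BoundedCache and the limit of 100 entries are hypothetical; a real application would pick a limit based on its own data, and this sketch is not thread-safe as written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A cache that evicts its least recently used entry once maxEntries is exceeded,
// so it can no longer grow indefinitely.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    BoundedCache(int maxEntries) {
        // accessOrder = true: iteration order goes from least to most recently used.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; returning true drops the eldest entry.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<Integer, String> cache = new BoundedCache<>(100);
        for (int i = 0; i < 1000; i++) {
            cache.put(i, "value-" + i);
        }
        // Only the 100 most recent entries are still in memory.
        System.out.println("size=" + cache.size() + ", has 0: " + cache.containsKey(0)
                + ", has 999: " + cache.containsKey(999));
    }
}
```

For concurrent access, the map can be wrapped with Collections.synchronizedMap().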

Manually triggering heap dump

If you do not want to wait for an OOM or if you just want to see what is in memory now, you can manually generate a heap dump. Here are 2 options to manually trigger a heap dump.

Option 1. Use VisualVM

Open VisualVM (JDK_HOME/bin/jvisualvm), right-click on the process in the left pane and select Heap Dump. That’s it.

Option 2. Use command line tools

If you do not have a graphical environment and can’t use VNC (VisualVM needs a graphical environment), use jps to find the process ID and jmap to generate the heap dump file, for instance jmap -dump:format=b,file=heap.bin <pid>. Those programs are also located in JDK_HOME/bin/.

Finally, copy the heap dump file (heap.bin) to your workstation and use VisualVM to read the heap dump: File -> Load…

Alternatively, you can also use jhat to read heap dump files.

Solution 3 (bonus). Call me

You can also contact my application development company and I can personally help you with these kinds of issues 🙂

How to fix java.lang.OutOfMemoryError: PermGen space

When you get an OutOfMemoryError with the message “PermGen space” (not to be confused with the message “Java heap space”), this means the memory used for class definitions is exhausted. Fortunately, most of the time, this is easy to fix.

Solution 1 (your best bet). Increase the size of PermGen space

If you have a Java process that uses a lot of classes (lots of jars) or if you have many applications deployed to your application container (Tomcat), you can allocate more memory to that “PermGen space” using the -XX:MaxPermSize VM argument. For instance, to allocate 512 MB of RAM to PermGen space, use -XX:MaxPermSize=512m.

Solution 2. Restart your application container

You can get this error if you redeploy an application (webapp) several times without restarting your application container (like Tomcat). Most application containers support hot-redeployment, but class-loading is complex and sometimes old class definitions remain in memory. In that case, your best option is to get into the habit of always restarting your application container (Tomcat) after you deploy an application to it. This is easy and it fixes many problems.

Solution 3. Fix your class-loader leak

If none of the above works, you are in trouble 🙁 Seriously, unless you hacked the class-loading of the JVM or application container, you should not have that problem. Or maybe it is a bug in a library you are using or in your application container. You can try to upgrade to the latest versions. If you hacked the class-loaders yourself, you may want to reconsider it. Why did you do that? Unless you are developing a JVM or an application container, you should not have to do that.

New website for my software development company

My application development company just got a new website. It shows more relevant information and has a lean design powered by Bootstrap. It is now hosted on Amazon EC2. If you need software for your business, call us. We will discuss your project and give you a free quote; see our pricing and application hosting packages. All the applications we develop run on desktop, tablet and smartphone out of the box.

Alright, enough bragging, back to work now!

VisualVM slow with heap dump files

One great feature of VisualVM is that it can read heap dump files. Heap dumps are useful to diagnose memory leaks. See this post for more details about memory leaks and how to solve them.

Why VisualVM is slow with heap dump

Another great feature of VisualVM is that you can read a huge heap dump file and VisualVM will consume a minimal amount of memory to do so. For instance, you will be able to read an 8-gigabyte heap dump file with VisualVM running on a development workstation having only 2 gigabytes of RAM. In order to achieve that, VisualVM parses the heap dump file and creates a work file on disk in the default system temp folder (/tmp by default on Linux). In theory that’s great, but in practice, VisualVM becomes painfully slow because it constantly has to do disk I/O to process the information.

This behavior is even more frustrating if you happen to have a server with 12 Gigabytes of RAM available for you. A simple solution for that is to create a ramdisk and tell VisualVM to use that ramdisk as the tmp folder.

The solution: use a RAMDisk

First, create the RAMDisk (tmpfs). Here I am on a Linux development server and I create a tmp folder in my home directory. Then I create (mount) the ramdisk in the tmp folder I just created:

Then I launch VisualVM and I modify the java.io.tmpdir VM arg that tells VisualVM where the system tmp folder is.

Now VisualVM is much much faster and I can investigate and find the root cause of that memory leak much faster.

kill -3 is your friend

One nice feature of the Java runtime is that when you send the QUIT signal to a Java process, it outputs the full thread dump to stdout. To send that signal, just open a terminal and type:

kill -QUIT <pid>

or

kill -3 <pid>

Where <pid> is the process ID. This does not terminate the process; all threads will continue doing what they were doing.

That feature can be very useful when the application seems to freeze or when you have a very intermittent issue (intermittent deadlock). With the full thread dump, you can see what every thread was doing at that particular moment. So in case of a deadlock, you will be able to see what monitors and what threads are involved.

This can also be helpful to diagnose performance bottlenecks. Suppose you are load testing an application and it does not deliver the expected throughput, but the CPU usage is not the problem. With kill -3, you might notice right away, for instance, that the JDBC connection pool is too small and all threads are waiting on it for a connection to free up.
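To give an idea of what such a dump contains, here is a minimal sketch that collects similar information from inside the JVM with the standard java.lang.management API: for each thread, its name, its state, the lock it is waiting on (if any) and its stack. The class ThreadDumpSketch and its output format are hypothetical; the real kill -3 output is formatted differently and includes more detail.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class ThreadDumpSketch {

    // Builds a simplified, text-only thread dump of the current JVM.
    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // Only ask for lock details when the JVM supports reporting them.
        ThreadInfo[] infos = threads.dumpAllThreads(
                threads.isObjectMonitorUsageSupported(),
                threads.isSynchronizerUsageSupported());
        for (ThreadInfo info : infos) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState());
            if (info.getLockName() != null) {
                // The monitor or lock this thread is blocked or waiting on.
                out.append(" waiting on ").append(info.getLockName());
            }
            if (info.getLockOwnerName() != null) {
                // The thread currently holding that lock: the key hint in a deadlock.
                out.append(" held by \"").append(info.getLockOwnerName()).append('"');
            }
            out.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Many threads showing the same "waiting on" lock, with stacks ending in the connection pool, is the pattern described above.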