K2 heavy load on large database, gateway timeouts

8 years 6 months ago #148324 by Nick
Hello to all the community

About two years ago I built a Joomla website based mainly on K2 for all the articles. The site carries mostly news and political content and is quite active, with about 10-15 articles posted a day. About 2-3 months ago stability problems appeared: the server was frequently unable to handle the load and we were getting "503 temporary service unavailable" errors. At first we thought this was due to a high visitor count, since we had about 4,000-5,000 unique visitors a day. But about 2-3 weeks ago the problem got so much worse that the web host was forced to disable the database, because it was running huge queries that affected the neighbouring servers and almost the whole network. At that point we were on shared hosting.

They told us that with such high activity (about 7,000-8,000 unique visitors a day) we should move to a VPS, and that the database (almost 2 GB) was too large to work properly. So we finally moved to a VPS. But after the migration nothing changed; the same problem appeared on the new server too. At first we thought the resources weren't enough, so we upgraded temporarily (they doubled the server's RAM), but the problem persisted. The host told me that the problem was the SQL server: the database was running large queries that could not complete, so it was overloading. At the same time HDD activity was about 50-60 Mb/s and the server could not do anything else, so the website was constantly down and unavailable. Here you can see the activity from the server:



I asked them to send me the large query that caused the problem. It is the following:

SELECT a.*, cc.name AS cattitle, cc.id AS categoryid, cc.alias AS categoryalias,
       cc.params AS categoryparams, u.username AS username, u.name AS realname
FROM xxkw9_k2_items a
LEFT JOIN xxkw9_k2_categories cc ON cc.id = a.catid
LEFT JOIN xxkw9_users AS u ON u.id = a.created_by
WHERE a.published = 1
  AND a.trash = 0
  AND cc.trash = 0
  AND cc.published = 1
  AND ( a.publish_up = '0000-00-00 00:00:00' OR a.publish_up <= '2015-10-10 19:53:56' )
  AND ( a.publish_down = '0000-00-00 00:00:00' OR a.publish_down >= '2015-10-10 19:53:56' )
  AND cc.id IN (8,9,10,11,12,13,14,17,15,21)

I forgot to mention that we have about 10,000 K2 items and 4,000-5,000 tags. After all this I was forced to move the site into a subfolder, and magically it worked properly and fairly fast there. Whenever I moved the site back to the root folder, we had the same overload problems. I even disabled all the plugins and the front-page modules and left only the articles, but nothing changed.
From other similar cases I read that the tags might be the problem, because I saw a large query involving tags. You can see it here:

codepaste.net/r55o6r

I then read that, with a minor change in the code, we can stop the IDs from querying the database. Here is the post:

www.joomlaworks.net/forum/k2-en/42833-k2-tags-causing-slow-mysql-queries

So I disabled the IDs, but again nothing changed: I saw some small improvements, but the initial problem persisted. By the way, here are all the queries, taken from Joomla's debug system:

codepaste.net/q4ffiw
codepaste.net/r55o6r
codepaste.net/5rojbi

You can see that a large number of queries are repeated 3, 4 or 5 times, and I don't know why.
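One way to find out which queries repeat, and how often, is to count them after normalising whitespace and letter case. The following is only a minimal Python sketch: it assumes the Joomla debug output has already been saved as a list of query strings, and the function name and sample queries are hypothetical, not taken from the pastes above.

```python
import re
from collections import Counter


def repeated_queries(queries, min_count=2):
    """Return (query, count) pairs for queries seen at least min_count times.

    Queries are compared after collapsing whitespace and lower-casing, so
    purely cosmetic differences do not hide a duplicate. Note that this
    also lower-cases string literals inside the query.
    """
    normalised = (re.sub(r"\s+", " ", q).strip().lower() for q in queries)
    counts = Counter(q for q in normalised if q)
    return [(q, n) for q, n in counts.most_common() if n >= min_count]


if __name__ == "__main__":
    # Hypothetical sample standing in for the real debug output.
    sample = [
        "SELECT id FROM xxkw9_k2_tags WHERE published = 1",
        "select id  from xxkw9_k2_tags where published = 1",
        "SELECT COUNT(*) FROM xxkw9_k2_items",
    ]
    for query, count in repeated_queries(sample):
        print(f"{count}x  {query}")
```

The queries with the highest counts are the first candidates to trace back to a module or plugin that is loaded more than once on the page.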

As a temporary measure, as I said earlier, I moved the website into a subfolder and redirected all the traffic there through .htaccess with this code:

RewriteEngine On
RewriteRule ^(/)?$ subfolder [L]

At first it seemed to work, but then we had the same problem again. So I made a single PHP page that redirects to the subfolder:

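<?php
// Permanent redirect to the subfolder. The Location value needs a scheme
// (http:// is assumed here); without one it is treated as a relative path.
header("Location: http://www.example.com/subfolder/", true, 301);
exit; // stop here so nothing else is sent after the redirect headers
?>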

This time things were better and the website was working, but we still frequently had the overload and the site became unavailable. Here are the past 12 hours from today:



During the blue peaks the whole server is unavailable and no one can see the website.
At peak load the CPU load was also very high. Here are the load averages:

Last 1 minute 20.95
Last 5 minutes 31.52
Last 15 minutes 28.09

For the server to work smoothly these numbers must be under 5.00.
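A load average is usually read relative to the number of CPU cores, so the same figure means different things on different machines. The Python sketch below divides the three figures by a core count; the count of 4 is purely an assumption for illustration, since the thread does not say how many cores the VPS has, and the function name is hypothetical.

```python
import os


def load_per_core(load_averages, cores):
    """Divide each load average by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return tuple(round(value / cores, 2) for value in load_averages)


if __name__ == "__main__":
    # Figures quoted in the post; the core count of 4 is an assumption.
    print(load_per_core((20.95, 31.52, 28.09), cores=4))

    # On the server itself (Unix only) the live values can be read directly.
    if hasattr(os, "getloadavg"):
        print(load_per_core(os.getloadavg(), os.cpu_count() or 1))
```

A per-core value that stays well above 1 for all three intervals suggests that work is queuing up faster than the machine can finish it.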
The host's technical staff made some changes, e.g. they switched the server from Apache to Nginx, but they couldn't do much more. They told me that the problem is the large database (one table alone is about 1 GB and the whole database is nearly 2 GB) and that the problem is in K2, so I must look there.

Please help me, guys. I have been dealing with this problem for about a week and I can't find a permanent solution, only temporary fixes. The problem is still here, and you can imagine what visitors have been seeing over the past weeks. When there is an overload I am forced to restart the VPS to clear the queries. Some people told me to delete the tags and see what happens and whether the database shrinks. But this isn't a good solution, since the person who manages the site needs them.

If you can suggest anything I would be glad.
Thanks in advance...

8 years 6 months ago #148393 by Krikor Boghossian
The issue in the post you saw was located in the mod_k2_tools module (tag cloud). Are you using this module?

Also, changing from Apache to Nginx was a really good move, since Nginx can handle far more concurrent requests and sessions.

In your position, since throwing money at larger and larger VPSs is not a solution, I would enable caching and use a CDN.
I have heard excellent things about Fastly. MaxCDN and Cloudflare are also reliable CDNs.

JoomlaWorks Support Team
---
Please search the forum before posting a new topic :)

8 years 6 months ago #148417 by Nick
Hi Krikor, and thanks for the reply.
Yes, I am using the tags module; as I already posted, there are about 4,000 tags on the site. But I have already tried disabling them and nothing changed. The only thing I haven't done is remove all the tags to see if there is any improvement. Another thing I forgot to mention is that the many tables of the Joomla finder plugin take up 2/3 of the database. Here are all the database tables:







I tried to empty some tables, but that broke the Joomla administration and many menus disappeared. The strange thing is that the website works in the subfolder, where it loads in 1-2 seconds. When I put it in the root, or set up a redirect and it starts receiving traffic, it hangs. Someone else advised me to replicate the whole Joomla installation on my PC using VirtualBox, Linux, PHP, Nginx etc., and to analyze the load with a web traffic tool such as httperf or JMeter. About caching: yes, I have heard of it too; I assume you mean database caching. I will test it in a local environment. As for cloud hosting, I hadn't thought of it. I was actually thinking of moving to a dedicated server, but maybe the CDN is a better solution. First of all, though, we must eliminate whatever is causing the load. Another strange thing: I put the website in maintenance mode, so that only admins could log in, and we still had the same load issues. It only works in a subfolder.

8 years 6 months ago #148445 by Krikor Boghossian
Yes, the finder plugin can go a bit rogue sometimes.

You need to disable the plugin and purge its results. This will also clear its tables. More than 1 GB will be gone :)
Joomla! has 3 layers of caching but adding db caching is always a good idea. www.ostraining.com/blog/joomla/cache/

Finally before moving to a dedicated server (quite a bit of $$) try the CDN solution for a week or so.

I have seen results where 97% of the traffic is sent to the CDN. This means that only 3% of your visitors will access your site's server. This translates into way lower server requirements.

8 years 6 months ago #148452 by Nick
Hello Krikor,
Problem located... It was probably a DDoS attack. I blocked all the traffic with .htaccess, and the website works very fast in the root folder with almost zero load on the server. The trick of a plain HTML page that redirects the traffic to the subfolder after 1-2 seconds also worked; probably it stopped all the spammers at the root and they were getting a 404 error. With the PHP or .htaccess method we were redirecting the attackers as well, so the result was the same. I also saw many entries from Googlebot, about 4-5 per minute; I don't know whether that was a problem too. In any case I will try the CDN. Thanks very much for the idea, I hadn't thought of it at all... From what I read, a CDN can prevent this kind of problem because the traffic is distributed. So if the problem is due to high traffic and attackers, with a CDN we won't have those issues. I will try it for a week or so and see the results...

8 years 6 months ago #148453 by Krikor Boghossian
I forgot about that completely.
When you change your site's IP address, Googlebot will scan your ENTIRE site. This will severely affect your site's speed and possibly its functionality as well.

You need to change the crawling speed / frequency in your Google Webmaster Tools account.

Finally, yes, CDNs protect you against the vast majority of DDoS attacks.

8 years 6 months ago - 8 years 6 months ago #148476 by Nick
It probably was not Googlebot. In any case I lowered the crawl speed, but this problem was present before the server migration. Yesterday I analyzed the access log and saw over 14,000 unique IPs and almost 26,000 hits on the root, while the actual visitors were about 500. That is a tremendous number, so it was flooding the whole VPS. I also tried Cloudflare's free service, but nothing changed. When I move the website to the root we get the same 504 gateway errors, now with the Cloudflare error page. Maybe changing hosting provider is the only solution after all...
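A count like this can be reproduced with a short script. The Python sketch below is only an illustration: it assumes the access log uses the common "combined" layout (client IP first, request line in quotes), and the function name, regular expression and sample lines are hypothetical rather than taken from the actual server.

```python
import re
from collections import Counter

# Matches the start of a combined-format log line (assumed layout):
# IP ident user [timestamp] "METHOD /path ...
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)'
)


def hits_per_ip(lines, path="/"):
    """Count requests for `path` per client IP; unparsable lines are skipped."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("path") == path:
            hits[match.group("ip")] += 1
    return hits


if __name__ == "__main__":
    # Hypothetical sample lines; in practice, iterate over the open log file.
    sample = [
        '203.0.113.5 - - [10/Oct/2015:19:53:56 +0300] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
        '203.0.113.5 - - [10/Oct/2015:19:53:57 +0300] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
        '198.51.100.7 - - [10/Oct/2015:19:53:58 +0300] "GET /news HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
    ]
    hits = hits_per_ip(sample)
    print(len(hits), "unique IPs,", sum(hits.values()), "hits on /")
    for ip, count in hits.most_common(10):
        print(f"{count:6d}  {ip}")
```

Sorting by count makes it easier to tell a flood that comes from a few addresses apart from one that is spread thinly over thousands of them.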
Last edit: 8 years 6 months ago by Nick.

8 years 6 months ago #148482 by Krikor Boghossian
I think Cloudflare's free service only caches the static files, not the entire site. I am not sure, but if this is the case then spam and DDoS traffic will still reach your server.

8 years 6 months ago - 8 years 6 months ago #148503 by Nick
Yes, they don't protect dynamic pages from this kind of attack. I analyzed the logs and found the suspect... It was indeed Googlebot, as I mentioned earlier: I saw 2-3 entries per second and over 150 per minute, so this is what made the server unavailable; it was effectively a DDoS attack. I have now blocked all the IP ranges and got the site working in the root, but I don't know whether I blocked the legitimate Googlebot as well. Attempts to slow down the crawl speed didn't work, and I don't think the normal Googlebot behaves like that: it crawls pages every 2-3 minutes, not two or three times a second...
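Whether a given address belongs to the real Googlebot can be checked before blocking it. Google's published advice is a reverse DNS lookup on the IP, a check that the host name ends in googlebot.com or google.com, and then a forward lookup to confirm that the name resolves back to the same IP. The Python sketch below follows those steps; the function name and the injectable resolver parameters are hypothetical conveniences, and the default forward lookup handles IPv4 only.

```python
import socket

# Host name suffixes used for the check (an assumption based on Google's
# published verification advice; adjust if that advice changes).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse=None, forward=None):
    """Return True if `ip` passes a forward-confirmed reverse DNS check.

    `reverse` maps an IP to a host name and `forward` maps a host name to
    a list of IPs. Both default to real DNS lookups and can be replaced
    with stubs for offline testing.
    """
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse(ip).lower().rstrip(".")
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        # The name must resolve back to the address we started from.
        return ip in forward(host)
    except OSError:
        # Failed lookups (no PTR record, unknown host) count as unverified.
        return False
```

Running this over the busiest addresses from the access log would show whether the flood came from the genuine crawler or from clients that only claim to be Googlebot in their user agent.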
Last edit: 8 years 6 months ago by Nick.

8 years 6 months ago #148527 by Krikor Boghossian
Actually the googlebot self-determines the optimal speed for crawling.
This means that the legit googlebot can cause a huge spike in your traffic and hence your db / server load.

If it was indeed the legit googlebot it will be over in a day or so max.
