Home > Other Systems (CPVLab, iMobiTrax, P202, Track Revenue, Click, Google Analytics, etc.) > Normal Prosper202

Fixing The Prosper Slowdown Problem - FOR GOOD. (33)


06-19-2013 03:07 PM #1 caurmen (Administrator)
Fixing The Prosper Slowdown Problem - FOR GOOD.

So, this isn't going to be your regular Follow-Along.

Rather than a campaign, I'm going to be working on a different project - an attempt to fix, once and for all, one of the biggest Pains In Our Collective Ass in affiliate marketing.

This is original research, so I'm going to be posting my findings as I go, so that we can combine the awesome powers of the STM collective to kill this hulking great bugbear once and for all.


So What's The Problem?

Prosper 202 slows to a crawl when it has more than a million or so clicks recorded, and eventually crashes.

Sometimes this happens with even less clicks. Sometimes it happens with a lot more. Regardless, when it does happen it's a colossal pain in the butt.

This problem tends to kick in just as affiliates are starting to have their first real success, and it continues to plague us up to super-affiliate level. I've heard tales of people dumping their entire tracking database every day - or even every 6 hours - just to keep the machinery running.

It affects everything that's built on the Prosper codebase, and similar things also happen with CPVLab.


What Makes Me Think It's Fixable?

The database engine that Prosper 202 uses, MySQL, certainly isn't limited to only a couple of million records. Here's an example of MySQl forum code that delivers results from 500 million rows in 0.02 seconds.

In addition, I've performed SQL queries on a Prosper database that was experiencing extreme slowdown, using the command line rather than the Prosper interface, and found that it was still very responsive. That suggests that we're not hitting hard data limits, but rather something internal - that we can, with any luck, fix.


So What's The Plan?

Firstly, I'm going to create a new test server, and I'm going to either write or modify a program to generate a dataset of any size in Prosper. That will mean I can easily simulate the effects of having a few million clicks stored in Prosper without having to actually pay for all that traffic

TIP: once I've got that test program up and running, you'll be able to use it to check how any new server will handle a large Prosper database - very useful information to know in advance!

After that, I'm going to figure out what size of database slows Prosper down, but doesn't crash it completely, and what size of database just locks Prosper up. I'll then start debugging, using MySQL's Slow Queries Log, to determine what's causing the slowdown.

Once I know exactly where the problem lies, I'll either attempt to speed up a small subsection of the code - probably the Keyword Analysis View, initially - or start making database changes to speed up the system overall.

Of course, I will publish my results here as I go!


How Can We Help?

We've got some very big programming brains here at STM, of course - including the maintainer of Prosper! I'd definitely welcome any and all technical input in terms of things to try, things you've tried before, or things that I definitely shouldn't try because they'll melt my server.

If you're not a technical expert but you've run into the Prosper 202 slowdown problem before, it would also be very helpful to know how much data caused the problem, what spec of server you were running, and if upgrading your hardware helped at all!

Right, let's go. First update in a couple of hours.

Let's get this thing done.


06-20-2013 03:16 PM #2 caurmen (Administrator)

Day 1 progress

Version 0.1 of the Prosper Data-Testing Tool is finished, and you can get it from here. It's a couple of Ruby scripts (using Ruby 1.9.3) that will generate a Prosper click database of any size you like as a SQL file, and likewise, a keyword database of any size you want too.

Feel free to test it out if you like! If you don't know how to use it, don't worry - when it's a bit more advanced I will package it up with a GUI or at least a slightly more usable command-line and write instructions for it.

Currently it's quite simplistic: it only handles direct-link data, it doesn't produce unique timestamps on the data (so all clicks appear to have come in at the same time), and it only links clicks to the first keyword in your database.

Nonetheless, it's already giving some quite interesting results.

Results So Far

I'm testing on a 1Gb VPS running on DigitalOcean, using a standard Apache/MySQL setup. Because it's DigitalOcean, it's using SSDs, which I suspect make a huge difference to complex SQL queries.

Prosper started to slow down around the 100,000 recorded clicks mark, taking 3 seconds to produce a keyword report. At 500,000 recorded clicks, it was up to 11-12 sec to produce a keyword report.

Assuming that scales in a linear fashion, I'd expect the keyword reports to become effectively unusuable at about 4,000,000 clicks, which sounds about right.

Interestingly, MySQL doesn't eat massive quantities of memory when it's producing a report at these data levels. That implies that some configuration changes in MySQL, increasing memory allowances, might be the first things to test.

My initial assessment of the code that generates the keyword reports ( /tracking202/ajax/sort_keywords.php in the standard Prosper codebase ) shows that it appears to run a number of queries in a loop depending on the number of active keywords, so that might produce a HUGE slowdown with more keywords. However, I've done some initial testing on that, and it seems the code is efficient enough that it's not a simple geometric slowdown with more keywords.

Next Step

The next thing to check is whether those times for reports go up massively if the data is associated with a wide variety of keywords. I'm going to modify my data generator to associate clicks with keywords randomly, and also to spread out click times over a wider timeframe.

I'm also going to put some debugging code into Prosper, either using good old Echo commands or possibly NewRelic, and see if I can get an idea of what's going on, when and why, under the hood.

Incidentally, one tool I've been finding very useful is the Github repository for Prosper 202. You can browse and search the code there.

Couple of questions, one for the techies and one for everyone, as I keep developing this:

TECHIES: Any recommendations on what sort of data I should try to attempt to bring Prosper to its knees? I'm really looking for anything that will suddenly stop the wait times being a linear progression, and start being geometric.

NON-TECHIES: How much data have you found Prosper takes before it starts to behave poorly or not at all when it's processing?

Onward!


06-20-2013 04:15 PM #3 Mr Baffoe (Veteran Member)

For the hosted Tracking202 that I spun off the prosper202 codebase I'm using summary tables for the reporting. That's enabled me to generate most reports even with 10million+ click in under a second.

I still haven't got around to optimizing the keyword reports, I've modified some of the table structures(data types) and indexes to speed things up. There's still a lot more work to be done

Also please note that the prosper202 on github is 1.6.1 I'll push the latest version up and would love to actually get a community actively submitting pull requests with fixes and updates.

Check out https://github.com/avazu/Avazu_202 a Mongodb version of 202, I haven't scanned it to see if it's safe to use (no backdoors) but it looks very promising


06-20-2013 05:19 PM #4 caurmen (Administrator)

Awesome! Thanks, that's extremely useful. I'll look into both using summary tables and altering the schema to speed things up.

Would be very good to get the latest version on Github! I don't know how many commits I'll make at this point, but I'm sure I'll be doing something at least.

Avazu: that sounds very interesting. I've been considering writing a MongoDB-based tracker, so finding one that exists already is... nice.


06-21-2013 07:08 AM #5 bandit09 ()

Caurmen, this is awesome you're working to tackle this beast of a task....I salute you. I regularly have to export my prosper databases to keep things running smoothly for reporting, and I know lots of other people on here that haven't spoken up that have to do the same....One thing that seemed to helped my speeds was switching from Apache to Litespeed, so that may be something you could look into testing. I think collectively we can all make an already amazing software even better, more refined, and much faster. In the end I believe it would greatly benefit all Prosper users, with the end result being more profit for all of us.


06-21-2013 06:56 PM #6 Mr Baffoe (Veteran Member)

Quote Originally Posted by bandit09 View Post
Caurmen, this is awesome you're working to tackle this beast of a task....I salute you. I regularly have to export my prosper databases to keep things running smoothly for reporting, and I know lots of other people on here that haven't spoken up that have to do the same....One thing that seemed to helped my speeds was switching from Apache to Litespeed, so that may be something you could look into testing. I think collectively we can all make an already amazing software even better, more refined, and much faster. In the end I believe it would greatly benefit all Prosper users, with the end result being more profit for all of us.
Yep that was one of the reasons P202 was made open source, when we first launched we use to get decent number of mods and fixes contributed back into the source for the community, but these days not so much.


06-22-2013 01:34 AM #7 deondup (Member)

This is probably the only gripe I've got with Prosper. I'm having to clear my database every day or two and I've become too lazy to make database backups. Sometimes I will just export Excel files but its a real pain in the ass.

Having the ability to handle large amounts of data would be GREAT!

Another thing that I was thinking of was having a function to easily export all your data to a separate database and have the ability to view it from within Prosper.

I'm no programmer but I would think that's not too hard to do?


06-22-2013 12:26 PM #8 caurmen (Administrator)

@deondup - that's a definite backup plan! Hopefully, though, we'll be able to solve the scaling problem so you don't even need to export data any more.


06-23-2013 12:30 PM #9 offerkiller (Member)

MongoDB and NoSQL is the way to go, if someone could add mobile capacities to Avazu 202 and commit on github that would be great!


06-23-2013 01:59 PM #10 polarbacon (Moderator)

Quote Originally Posted by caurmen View Post
@deondup - that's a definite backup plan! Hopefully, though, we'll be able to solve the scaling problem so you don't even need to export data any more.
Not to be the negative nelly on this but I think setting a realistic goal with a set server spec is more applicable to what 95% of affs need....when you get into the millions of clicks a day to process that much data you need to look at other things besides just the code....

hence why hadoop exists....if google could do a search for 100 million data points on a 1gb vps...I think they would be doing it if it was possible....right?

building out a 202 that can handle 10 million or so data points and return it in a timely fashion (sub 30 sec)....AND not slow your redirects to a crawl while its doing that....WHILE running that on a BH 1gb VPS instance....

That to me is worth your time....

Not shitting on your effort...I think its great your taking this on....

But from your post thus far and maybe I am reading it totally wrong but just seems like we are shooting for mars on this one..yet we are only trying to get to the moon....


and you need to use the full 149$/ month version of new relic....once its on your server it can show you all the bottle necks in the code....execution times.....sql write times....etc....pretty much alot of what your looking to fix....so I would get that setup asap if I was you....


06-24-2013 10:52 AM #11 offerkiller (Member)

Quote Originally Posted by polarbacon View Post
Not to be the negative nelly on this but I think setting a realistic goal with a set server spec is more applicable to what 95% of affs need....when you get into the millions of clicks a day to process that much data you need to look at other things besides just the code....

hence why hadoop exists....if google could do a search for 100 million data points on a 1gb vps...I think they would be doing it if it was possible....right?

building out a 202 that can handle 10 million or so data points and return it in a timely fashion (sub 30 sec)....AND not slow your redirects to a crawl while its doing that....WHILE running that on a BH 1gb VPS instance....

That to me is worth your time....

Not shitting on your effort...I think its great your taking this on....

But from your post thus far and maybe I am reading it totally wrong but just seems like we are shooting for mars on this one..yet we are only trying to get to the moon....


and you need to use the full 149$/ month version of new relic....once its on your server it can show you all the bottle necks in the code....execution times.....sql write times....etc....pretty much alot of what your looking to fix....so I would get that setup asap if I was you....
The only way here to solve the issue once for all instead of clearing the database retardedly everyday or week is to either migrate the whole code base of P202 from mysql to a nosql storage system like mongodb or hadoop. If you push half million plus clicks per day on P202 you will be confronted with the below issues, no matter what server you use:

1. Depending on QPS and concurrency level of redirections your click redirection latency is HUGELY slowed down, that affects your revenue directly if you lose out 10% of your clicks due to bad latency

2. MySQL itself is doing table or coloumn based locks depending on storage engine, the issue here is when you try to retrieve a huge report within P202 backend the operation itself WILL ALSO AFFECT your redirection speed. The reason: P202 itself is a great business intelligence tool, but performs BADLY when trying to scale volume, if you look at their redirection php files they performs direct INSERT statements into tables, where the reporting features are retrieving data from. That is OVERKILL for performance, because if you run a report and your table is locked (MYISAM applies table-level locking and is the standard storage engine when using P202), then your redirection needs to WAIT until your report is retrieved!

3. If you come to the level with pushing 500k plus clicks per day you should be able to do 5 figures revenue daily, in this case you would not want to lose cash for system downtime, so your tracking system also NEEDS to be highly available. Here P202 fails as well because even when you have an experienced DBA or SA doing data replication and building some clusters for you, you CANNOT scale due to the code base. In order to scale with replication the code itself needs to be RW (read write) splitted.

Avazu 202 is implemented on Mongo DB and supports linear scale, that means if you push 1million clicks per server and you need to handle 10million clicks now you can just add 9 additional servers to handle this load. It is designed to handle millions of clicks without losing the convenience of P202's reporting capacities. As MongoDB supports auto sharding and replication natively Avazu 202 also supports high availability without single point of failure.

As a disclaimer here: I'm one of the initiators and author of Avazu 202, we noticed those above mentioned challenges already in 2010 and fixed those issues in 2011, Avazu 202 is a part of our restructured code base and we now use it running on systems pushing 8 figures of clicks on the daily base. The project itself is however not mobile ready yet and I would like to invite everyone to work on it, integrating mobile reporting, targeting and redirection features :-)


06-24-2013 10:59 AM #12 hd2010 (Member)

How about just change to TokuDB ?


06-24-2013 11:20 AM #13 caurmen (Administrator)

Not to be the negative nelly on this but I think setting a realistic goal with a set server spec is more applicable to what 95% of affs need....
Yep, fair points - I was maybe shooting a bit high with "never" having to export your data.

I do think we can probably get to the point where it's much less of an issue than it is today.

building out a 202 that can handle 10 million or so data points and return it in a timely fashion (sub 30 sec)....AND not slow your redirects to a crawl while its doing that....WHILE running that on a BH 1gb VPS instance....
That's pretty much where I'm shooting for - although we won't know quite what's possible until I've done some more research. As I'm sure you know, this sort of thing can be unpredicable. I might only be able to double or triple the amount of data Prosper can handle before locking up - or I might be able to massively improve it.

More work and more updates today/tomorrow!


06-24-2013 11:21 AM #14 caurmen (Administrator)

@hd2010 - Good suggestion! That's on the list for testing, along with testing a couple of the MySQL variants.

@offerkiller - you've got a very interesting project there! I'm going to take a look over the next couple of days.


06-25-2013 06:48 AM #15 Mr Baffoe (Veteran Member)

Quote Originally Posted by offerkiller View Post

As a disclaimer here: I'm one of the initiators and author of Avazu 202, we noticed those above mentioned challenges already in 2010 and fixed those issues in 2011, Avazu 202 is a part of our restructured code base and we now use it running on systems pushing 8 figures of clicks on the daily base. The project itself is however not mobile ready yet and I would like to invite everyone to work on it, integrating mobile reporting, targeting and redirection features :-)
Had no idea you were on the forum, looks like some pretty nice stuff you've done here and thanks for sharing.

Ping me nana@tracking202.com


06-25-2013 10:17 AM #16 Mr Green (Administrator)

I love where this thread is heading!


06-25-2013 04:45 PM #17 caurmen (Administrator)

Righty! Time to try some fixes.

So far, on my list to try are:



After I've tried all that lot, I'll move into more granular line-by-line code optimisations.


06-25-2013 05:06 PM #18 caurmen (Administrator)

Things have gone very well so far, I must say!

OPTIMIZE TABLES: 2x speed improvement!

OK, I wasn't expecting this one.

I'm using a fresh Prosper install which has just had a SQL document containing 500k rows read into it. Given the data's only just been added, I really wasn't expecting OPTIMIZE TABLES to have any effect. I only even tried it to make sure my MariaDB test (below) was fair.

I used my standard test: my 500k rows of data and a refresh of the Analyze Keywords screen. I repeated the test 3 times and averaged the results.

Result?

BEFORE: 11.2 sec
AFTER: 6.8 sec

CONCLUSION: Even if your Prosper has only been running for a day, OPTIMIZE TABLES has a pretty massive effect. It might even be enough to bring a Prosper install that has stopped running back from the brink.

Testing MariaDB: 3x speed improvement!

So, the first thing I was intending to try was replacing MySQL with the far more up-to-date drop-in replacement, MariaDB.

I wasn't expecting much from this test - maybe a 10%-20% speed improvement.

Installing MariaDB is not exactly trivial - given the results I've seen from this test, I'll write up a tutorial on how to do it later this week.

However, the speed improvement it gave was more than worth it.

Once again, I was using a 500k rows data set here, and timing a refresh of the Analyze Keywords screen. I repeated the test 3 times and averaged the results.

BEFORE: 6.8 sec.
AFTER: 2.5 sec.

That's a 172% improvement in speed from the change!

Not Too Shabby.

CONCLUSION: Run, do not walk, to install MariaDB. It has a very good reputation and I've not heard any horror stories, and it appears to have a major speed impact.

As I said, I'll write up instructions for changing to MariaDB as a tutorial, but for now, if you want to get started here's:



(I haven't tried the CentOS one, but all the steps look right.)

Next, to try the new table formats in MariaDB, before also giving TokuDB a whirl.

Comments? Suggestions? Please do let me know!


06-25-2013 08:07 PM #19 bbrock32 (Administrator)

The above post is pure gold!

It's something that requires minimal effort and has such a big impact.

Will def update all my P202 setups to MariaDB for the moment.


06-26-2013 05:18 AM #20 rafael (Member)

Wow I love this thread alread. I wish I had enough DB knowledge to help out more.

Thanks a ton for this caurmen!


06-26-2013 09:42 AM #21 caurmen (Administrator)

No problem! I was pretty startled at its impact myself.

Do let me know how the upgrade goes on your systems - I'll be very interested to see if the effects of MariaDB vary between servers, datasets, etc.


06-26-2013 09:53 AM #22 dynamicsoul (Member)

Have you tried http://www.postgresql.org/ ?

I run a site on a dedicated server which handles massive amounts of data each day, and seen a huge increase in speed, and overall reduction in load by swapping from mySql..


06-26-2013 10:37 AM #23 caurmen (Administrator)

I'll look into testing PostgreSQL - that's a good idea.

The major problem is likely to be that from what I recall last time I split-tested it with MySQL, it's not exactly a drop-in solution. I'm trying to avoid making huge changes to the Prosper codebase if I can, so anything that requires a rewrite of the entire schema and DB layer is probably somewhere I don't want to go immediately.


06-26-2013 10:52 AM #24 dynamicsoul (Member)

Yeah agree.. its not the simple solution.


06-26-2013 06:18 PM #25 caurmen (Administrator)

Testing Day 2

No such huge breakthroughs today, I'm afraid.

Today I've tested:



Next, I think it's time to take a really good look at what queries are running slow, and hit the indexing hard. I've already used EXPLAIN EXTENDED on a couple of the SQL queries, and I've seen at least one table where all 500k rows are being read every time, no index.


06-26-2013 09:01 PM #26 bbrock32 (Administrator)

Keep us updated with the new tests!


06-27-2013 08:21 AM #27 gritz (Member)

awesome! Gonna ditch MySQL for MariaDB asap!

Question: are you running Apache? Over the past few months I have transitioned every single one of my servers to Nginx...and for the first time my pages feel like they really load instantaneously. My RAM usage isn't taking a beating either. And for those wondering, I've been using this script to automate installation of everything.


06-27-2013 11:34 AM #28 caurmen (Administrator)

@bbrock - will do!

@gritz - yep, currently this server is running Apache. I run Nginx on other servers (and LiteSpeed on still others), but for this test I wanted to go for the most standard configuration.

For tracking, there are actually some solid arguments for running an Apache server. The implementation of mod_php has been tested, and it's actually somewhat faster than Nginx/PHP-FPM. Not a huge amount, but it is.

Additionally, I've found that a lot of the memory advantages of Nginx are offset on complex PHP scripts by the memory usage of PHP-FPM.

So my usual approach as of June 2013 is to use Nginx for anything that isn't using PHP - static files, particularly - and to use Apache on servers with plenty of RAM which are primarily going to run massive, hard-to-cache PHP scripts.

In this case, however, I'm pretty sure that the bottleneck isn't the web server, but the database or possibly the code - although you have just reminded me that I should probably test with a PHP accelerator in the mix!


06-27-2013 12:06 PM #29 gritz (Member)

Quote Originally Posted by caurmen View Post
@bbrock - will do!

@gritz - yep, currently this server is running Apache. I run Nginx on other servers (and LiteSpeed on still others), but for this test I wanted to go for the most standard configuration.

For tracking, there are actually some solid arguments for running an Apache server. The implementation of mod_php has been tested, and it's actually somewhat faster than Nginx/PHP-FPM. Not a huge amount, but it is.

Additionally, I've found that a lot of the memory advantages of Nginx are offset on complex PHP scripts by the memory usage of PHP-FPM.

So my usual approach as of June 2013 is to use Nginx for anything that isn't using PHP - static files, particularly - and to use Apache on servers with plenty of RAM which are primarily going to run massive, hard-to-cache PHP scripts.

In this case, however, I'm pretty sure that the bottleneck isn't the web server, but the database or possibly the code - although you have just reminded me that I should probably test with a PHP accelerator in the mix!
interadasting. On my VPS's, I've been running PHP APC and Memcache - I'm not sure if I only need 1 or the other, but with both installed and running it seems to work great. I've been keeping htop open while I run lots of Facebook traffic over the course of a few months, and I swear it seems like my Nginx setup has tooooons of RAM to spare (and I'm only using a 512!). Been super happy with Nginx in general. Love this thread, keep bringing the good datas!


06-27-2013 12:45 PM #30 caurmen (Administrator)

Nginx is the dog's bollocks, definitely, for an awful lot of things. I really like it as a server in general. (The fact that the darn thing can also cache better than Varnish, by some reports, doesn't hurt either.)

Interesting data - thanks! Are you tracking with Prosper on those servers?


02-22-2014 10:36 PM #31 afilia320 (Member)

^^ Nevermind, problem solved.

For anyone having the same issue, make sure you have phpmyadmin installed. I somehow skipped that from the installation guide anyways.

Thanks


02-22-2014 10:37 PM #32 afilia320 (Member)

BTW, has anyone tried to run STM mobile tracker on MariaDB? As it is a fork of Prosper, I tend to think it should work as well, but I would like to check if someone has already gone through the experience.

Thanks


02-24-2014 12:57 PM #33 caurmen (Administrator)

I can't recall if I have, but there shouldn't be any issues with doing that.


Home > Other Systems (CPVLab, iMobiTrax, P202, Track Revenue, Click, Google Analytics, etc.) > Normal Prosper202