Using Nginx to perform A/B Test in PHP applications with Docker
For a year, we faced the challenge of evolving a legacy application, developed for PHP 5.3 on an old and discontinued framework, to a PHP 7.3 environment.
Analyzing the possible problems and actions of this evolution produced tasks such as correcting the framework, identifying deprecated functions, mitigating fatal errors, and mitigating errors in results.
Although traffic never exceeded 500 users online per hour, this was not an institutional website but a system that employees used 24 hours a day.
What we describe here is the detailed planning of the evolution and the detection of possible errors, along with the use of Nginx as a reverse proxy that decides probabilistically whether a user reaches version 5.3 or 7.3. The aim was to reduce the number of users affected while still identifying critical navigation areas that needed attention.
If you only want to see how the solution works, we suggest skipping the planning part, which covers how we identified the best time for the migration and how we identified migration errors.
To map the scenario and separate its risks, we listed the two main actors of this evolution: the server (as a machine) and the software.
Before the evolution, the Apache server ran directly on the host, requiring the installation of several extra packages and manual module activations or configuration changes.
Planning for the server went through an arbitrary checklist whose items came from previous experience rather than from any established evolution pattern.
The server checklist was validated first because its items were direct impediments to the evolution and could even generate demand for a migration that included hardware.
We then began identifying existing accesses and errors, segmenting the errors into immediately mitigable errors, non-severe errors, and fatal errors.
Identification of accesses
We used GoAccess (https://goaccess.io/) to identify the best time range to perform the migration. This was the first step because it helps in planning which developers can be involved in the migration and how much time we would have for the whole procedure.
Installing GoAccess is simple, and instructions can be found at https://goaccess.io/download#distro. Once it was installed, two log files were transferred to a development computer.
These files had been compressed and split one week apart by logrotate (https://linux.die.net/man/8/logrotate).
Among the various pieces of information that GoAccess provides, the distribution of accesses per hour was the most important in this case. This was the distribution of one of the environments:
An important point about this chart: the red line is not on the same scale as the blue line, so it can be misread as Hits < Visitors, which is not possible. According to the definition in the documentation:
A hit is a request (line in the access log), e.g., 10 requests = 10 hits. HTTP requests with the same IP, date, and user-agent are considered a unique visit.
We then have the range of possible hours to perform the evolution, which involves stopping Apache because Nginx will have to run on port 80. More details are in the Run item.
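The per-hour distribution that GoAccess charts can also be approximated with a few lines of Python, which helps when double-checking the quietest window. This is a minimal sketch, not part of our original procedure: it assumes the access log uses the Common or Combined Log Format timestamp, and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Matches the hour inside a Common/Combined Log Format timestamp,
# e.g. [10/Oct/2019:13:55:36 -0300] captures "13".
TIMESTAMP_HOUR = re.compile(
    r"\[\d{2}/[A-Za-z]{3}/\d{4}:(\d{2}):\d{2}:\d{2} [+-]\d{4}\]"
)


def hits_per_hour(lines):
    """Count requests (hits) for each hour of the day, 0 to 23."""
    counts = Counter({hour: 0 for hour in range(24)})
    for line in lines:
        match = TIMESTAMP_HOUR.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def quietest_hours(counts, how_many=3):
    """Return the hours with the fewest hits, quietest first."""
    return sorted(counts, key=lambda hour: (counts[hour], hour))[:how_many]


# Made-up sample lines, not taken from the real logs.
sample = [
    '10.0.0.1 - - [10/Oct/2019:09:15:01 -0300] "GET / HTTP/1.1" 200 512',
    '10.0.0.2 - - [10/Oct/2019:09:40:12 -0300] "GET /login HTTP/1.1" 200 734',
    '10.0.0.1 - - [10/Oct/2019:14:02:45 -0300] "POST /save HTTP/1.1" 302 0',
]
counts = hits_per_hour(sample)
print(counts[9], counts[14], quietest_hours(counts, how_many=2))
```

For the compressed files produced by logrotate, the lines could be read with gzip.open(path, "rt") and passed to hits_per_hour directly. Note that this counts hits only; it does not reproduce the GoAccess definition of a unique visit.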
Non-severe errors and Fatal errors
To identify existing errors in the production environment, we analyzed the domain's Apache error file.
Several Notices and Warnings were already appearing in it, and the intention was to get a visual sense of the frequency and type of each problem.
We believe there are tools that can measure this with great statistical accuracy.
For this case, however, it was enough to use Elasticsearch, Kibana, and Filestash to read the error.log file and verify how frequent the known PHP errors were.
As this was a development-only analysis, a Docker container without an optimized configuration was used.
With the log file inserted into Elasticsearch, we were able to use Kibana to run some queries and arrive at statistical results, as shown in the image below.
The image shows only a piece of the error log for privacy reasons.
To my surprise, Warnings > Notices. That is, we had a delicate problem ahead, because the natural flow of PHP's evolution is to convert Notices into Warnings and Warnings into Fatal errors.
This follows from the language changelog as well as from proposed RFCs (https://wiki.php.net/rfc/engine_warnings).
To better identify each Notice and each Warning, a Kibana search can be performed in Discover > Filters.
The top filter in Kibana is a "contains" match and can therefore take a partial value, such as PHP Warning.
This returns all records from the selected period whose errors contain PHP Warning.
After defining all the common records in a given period, we saved the errors so that we could compare them after the evolution of the environment.
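For readers without an Elasticsearch stack at hand, the same kind of tally can be sketched in plain Python. This is an illustrative alternative, not what we ran: it assumes each PHP message in error.log carries the usual "PHP Warning:" or "PHP Notice:" prefix, and the sample lines and the helper names are hypothetical.

```python
import re
from collections import Counter

# PHP prefixes its log messages with the error level, e.g. "PHP Warning:  ...".
PHP_ERROR = re.compile(
    r"PHP (Notice|Warning|Deprecated|Fatal error|Parse error):\s+(.*)"
)
# Apache may append the referer to the line; drop it so that identical
# errors are grouped together.
REFERER = re.compile(r", referer: .*$")


def count_php_errors(lines):
    """Return (count per level, count per distinct level and message)."""
    by_level = Counter()
    by_message = Counter()
    for line in lines:
        match = PHP_ERROR.search(line)
        if match:
            level = match.group(1)
            message = REFERER.sub("", match.group(2)).strip()
            by_level[level] += 1
            by_message[(level, message)] += 1
    return by_level, by_message


def new_or_more_frequent(before, after):
    """Errors that appear, or appear more often, after the evolution."""
    return after - before  # Counter subtraction keeps positive counts only


# Made-up sample lines, not taken from the real error.log.
sample = [
    "[Tue Oct 01 10:00:00 2019] [error] [client 10.0.0.1] PHP Warning:  Division by zero in /app/report.php on line 10",
    "[Tue Oct 01 10:00:05 2019] [error] [client 10.0.0.2] PHP Warning:  Division by zero in /app/report.php on line 10, referer: http://example.test/",
    "[Tue Oct 01 10:01:00 2019] [error] [client 10.0.0.1] PHP Notice:  Undefined index: id in /app/view.php on line 3",
]
by_level, by_message = count_php_errors(sample)
print(by_level.most_common())
```

Running count_php_errors on the log saved before the evolution and again on the log collected afterwards, then passing both by_message counters to new_or_more_frequent, gives a rough list of the regressions worth looking at first.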
Errors mitigated immediately:
To identify the mitigable errors, we used two tools: PhpCodeFixer (https://github.com/wapmorgan/PhpCodeFixer) and PHP Coding Standards Fixer (PhpCsFixer) (https://github.com/FriendsOfPHP/PHP-CS-Fixer).
We used both for the sake of redundancy, since they were developed by different people and with different scopes.
PhpCsFixer can apply the corrections on its own if the corresponding option is set. PhpCodeFixer only performs the analysis, detailing which corrections should be made and why.
Using Nginx as a reverse proxy to switch between versions
Once the code and functionality were fixed, the behavior of the software had to be put to the test.
This application had no tests (unit or functional), and the framework had been changed before the evolution to PHP 7.3,
which made the autoload complex enough that writing tests became impossible.
As described earlier, one of the characteristics of the servers was that Apache was installed directly on the OS.
Therefore, after installing Docker and Docker Compose (for convenience), we used a modified https://github.com/fabianofa/docker-php-devenv image together with the https://github.com/fabianofa/docker-nginx-ab image.
The functionality is described in the README.md file of the https://github.com/fabianofa/docker-nginx-ab repository.
Basically, the flow is as follows: on each request, the version to use is chosen by a probabilistic distribution.
The choice is sent to the user as a cookie, so that it is sent back and repeated on every subsequent request.
This lets the user stay on the same version of the application until the period set in Max-Age ends or the cookies are cleared.
The diagram demonstrates the flow of requests:
- The user's request reaches the server; the server decides which version to use and generates a cookie to send in the response.
- The request is then forwarded via proxy_pass to one of the servers;
- The servers fulfill the request, typically isolated in terms of session, PHP version, modules, packages, and extensions.
Note: in the image, the Apache version appears as 7.3; for this case, read it as 5.3. The complexity is mitigated because requests are split between a container with its own environment and the host. If the host had Apache + PHP 7.1, the implementation would be the same.
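The decision logic in the flow above can be sketched in Python to make the cookie behavior concrete. This is only a model of the idea, not the Nginx configuration from the repository: the cookie name, the version labels, and the function name are hypothetical. In Nginx itself, a percentage-based split of this kind can be expressed in configuration, for example with the split_clients directive.

```python
import random

COOKIE_NAME = "ab_version"      # hypothetical cookie name
VERSIONS = ("php53", "php73")   # hypothetical labels for the two backends


def choose_version(cookies, share_new, max_age, rng=random):
    """Pick a backend for one request.

    cookies:   dict of cookies sent by the browser
    share_new: fraction (0.0 to 1.0) of new visitors sent to PHP 7.3
    max_age:   cookie lifetime in seconds
    Returns (version, set_cookie); set_cookie is None when the browser
    already carries a valid choice.
    """
    current = cookies.get(COOKIE_NAME)
    if current in VERSIONS:
        return current, None  # sticky: honour the earlier decision
    version = "php73" if rng.random() < share_new else "php53"
    set_cookie = f"{COOKIE_NAME}={version}; Max-Age={max_age}; Path=/"
    return version, set_cookie


# First request: no cookie yet, so a version is drawn and a cookie is issued.
rng = random.Random(42)  # seeded only to make the example repeatable
version, header = choose_version({}, share_new=0.2, max_age=3600, rng=rng)

# Later request: the browser sends the cookie back and keeps its version.
same_version, no_header = choose_version(
    {COOKIE_NAME: version}, share_new=0.2, max_age=3600, rng=rng
)
print(version, header, same_version, no_header)
```

In this model, raising share_new over time moves more new visitors to PHP 7.3, while share_new=0.0 combined with a short max_age sends everyone back to PHP 5.3 as soon as their cookies expire.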
To make this possible, the host Apache server (the former 5.3) now uses ports 8081 and 8443 as its defaults, both blocked by the firewall.
Requests could still flow freely internally, since Docker, when started, defines its own network for communication between containers and the host, commonly called docker0.
After this functionality was implemented, we defined dates on which the percentage split between the versions would change as the code was corrected and the notices and warnings were mitigated.
It was also simple to guide users to log out, clear their browser cookies, and log in again when a considerable problem occurred, whether caused by configuration or by an error that had not appeared before.
In those cases, giving both percentage entries the same version name sent 100% of requests to a single version, here 5.3, and the Max-Age time was reduced.
After the correction, once Max-Age expired, the user might or might not receive the new version.
After the operation, constant monitoring of error.log was mandatory. Besides the clear Warnings and Notices being mitigated, including some Fatal errors generated by changes to the framework, other errors arose from configuration that was not uniform between Apache and Nginx.
This server received file uploads, and PHP has directives that define the maximum upload size and POST body size, upload_max_filesize and post_max_size respectively. These values had to be matched in the nginx.conf file through client_max_body_size.
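A mismatch like this is easy to check mechanically. The sketch below is a hypothetical helper, not something from our setup: it assumes the three values are given as plain strings with an optional K, M, or G suffix, and that the desired ordering is upload_max_filesize <= post_max_size <= client_max_body_size.

```python
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def to_bytes(size):
    """Convert a size such as "8M", "512k" or "1048576" to bytes."""
    text = size.strip().lower()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def check_upload_limits(upload_max_filesize, post_max_size, client_max_body_size):
    """Return a list of problems; the list is empty when the limits agree."""
    upload = to_bytes(upload_max_filesize)
    post = to_bytes(post_max_size)
    body = to_bytes(client_max_body_size)
    problems = []
    if post < upload:
        problems.append("post_max_size is smaller than upload_max_filesize")
    if body != 0 and body < post:  # Nginx treats 0 as "do not check the size"
        problems.append("client_max_body_size is smaller than post_max_size")
    return problems


# PHP defaults (2M, 8M) against the Nginx default (1m): uploads above 1 MB
# would be rejected by Nginx before PHP ever sees them.
print(check_upload_limits("2M", "8M", "1m"))
# Example values chosen so that all three limits line up.
print(check_upload_limits("20M", "25M", "25m"))
```

The check ignores the special meaning of 0 on the PHP side, so it is best treated as a quick sanity test rather than a full validation of both configurations.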