Machine Learning
Machine Learning Overview
Machine Learning prediction is a Globalyzer Workbench and Globalyzer Lite feature that helps users handle false positives. We suggest applying machine learning as a follow-up step to scanning with Rule Sets. It helps determine which candidate issues detected by the Rule Sets are genuine i18n issues.
Installation
Prerequisite: Python 3.6.x and H2O.ai 3.x
1. Download Python 3.6 or later from https://www.python.org/downloads/
2. Install Python and add it to the PATH environment variable.
3. Go to http://h2o-release.s3.amazonaws.com/h2o/rel-wheeler/4/index.html and open the "INSTALL IN PYTHON" tab.
Install the dependencies (prepend `sudo` if needed):
pip install requests
pip install tabulate
pip install scikit-learn
pip install colorama
pip install future
At the command line, copy and paste these commands one line at a time:
pip uninstall h2o
pip install http://h2o-release.s3.amazonaws.com/h2o/rel-wheeler/4/Python/h2o-3.16.0.4-py2.py3-none-any.whl
The installation succeeded if the output includes "Successfully installed h2o-3.16.0.4".
Test 1: Open a command prompt and type "python -V". The test succeeds if the reply is a Python version such as "Python 3.6.2".
Test 2: At the command line, start the Python interpreter and enter:
> import h2o
> h2o.init()
This should complete without errors.
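The two checks above can also be scripted. The following is a minimal sketch and is not part of Globalyzer: the function names `python_version_ok` and `missing_modules` are hypothetical, and the list of import names is an assumption based on the packages installed above (scikit-learn is imported as `sklearn`). It only checks that the modules can be found; it does not start H2O, so running `h2o.init()` by hand is still the real test.

```python
import importlib.util
import sys

# Import names for the packages installed above (assumption: scikit-learn
# is imported as "sklearn"; the others import under their pip names).
REQUIRED_MODULES = ["requests", "tabulate", "sklearn", "colorama", "future", "h2o"]


def python_version_ok(version_info=sys.version_info, minimum=(3, 6)):
    """Return True if the interpreter is at least the minimum (major, minor)."""
    return tuple(version_info[:2]) >= minimum


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version OK:", python_version_ok())
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

If the script reports missing modules, rerun the corresponding pip install command and check again.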
Work Flow
First, create a Globalyzer project with scans in the Globalyzer client. In the Scan Results view, right-click an issue that you have determined to be a false positive and choose "Mark prediction as false positive(F)" from the menu. Mark at least several issues as false positives before using "Find more false positives" under the Machine Learning menu.
After marking some issues as false positives, click "Find more false positives" under the "Machine Learning" menu and wait for the prediction process to finish. There are three possible predictions: "ML False", "ML NULL", and "ML True". Based on the issues you marked and the issues filtered by the Rule Set, machine learning predicts some issues as "ML False", meaning they are likely false positives. "ML NULL" means machine learning cannot predict whether the issue is a genuine issue or a false positive, because there is not enough input data to make a prediction for it.
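One way to picture the three labels is as a confidence score compared against thresholds. The sketch below is illustrative only and is not Globalyzer's implementation: the function `label_prediction`, the score `p_false_positive`, and the thresholds 0.3 and 0.7 are all assumptions chosen for the example.

```python
from typing import Optional


def label_prediction(p_false_positive: Optional[float],
                     low: float = 0.3,
                     high: float = 0.7) -> str:
    """Map a hypothetical false-positive probability to one of the three labels.

    None stands for "no score available", for example when there is too
    little marked data to predict anything for this issue.
    """
    if p_false_positive is None:
        return "ML NULL"
    if p_false_positive >= high:
        return "ML False"   # confidently predicted to be a false positive
    if p_false_positive <= low:
        return "ML True"    # confidently predicted to be a genuine issue
    return "ML NULL"        # score too uncertain to call either way


if __name__ == "__main__":
    for score in (0.9, 0.5, 0.1, None):
        print(score, "->", label_prediction(score))
```

Under this picture, marking more issues gives the model more data, which is why fewer issues should end up as "ML NULL" on later runs.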
If issues predicted as "ML False" are in fact genuine issues, right-click the issue and select "Mark prediction as true positive(T)". The next time you run "Find more false positives", machine learning will learn from your correction. If you are not satisfied with the prediction results, continue marking more issues as "F" or "T" and rerun "Find more false positives".