Most existing research on electricity theft cyber-attacks focuses on the consumption domain. However, a high penetration level of distributed generators (DGs) may lead to more electricity theft cyber-attacks in the distributed generation domain, which is the focus of this paper. In these attacks, malicious customers hack into the smart meters that monitor their DG units, which are usually photovoltaic (PV) systems, and manipulate the readings to report more energy injected into the grid and thereby claim more profit under feed-in tariff programs. This paper proposes a data-driven, machine-learning-based approach to detect such thefts. We adopt an anomaly detection approach in which a theft detection unit (TDU) based on a regression tree model is designed to detect suspicious data. Historical records of solar irradiance, temperature, and smart meter readings are used to train the detector. The probability density function of the error between the actual readings from DG meters and the generation predicted by the regression model serves as the metric for detecting suspicious data. Several theft scenarios are used to assess the performance of the TDU. Furthermore, a comparison with other detectors is presented to demonstrate the advantage of the proposed TDU.
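To make the detection idea concrete, the following minimal sketch illustrates the general pattern described above: a regression tree predicts generation from irradiance and temperature, and a reading is flagged when its prediction error falls in the upper tail of the error distribution learned from honest data. It is not the paper's implementation. The synthetic data generator `simulate`, the tree depth, the Gaussian fit to the errors, the 0.995 quantile threshold, and the 1.5x over-reporting scenario are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, mean

def fit_tree(X, y, depth=0, max_depth=5, min_leaf=5):
    """Fit a small regression tree with greedy squared-error splits."""
    n = len(y)
    if depth >= max_depth or n < 2 * min_leaf:
        return mean(y)  # leaf: predict the mean target
    total, best = sum(y), None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += y[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if k < min_leaf or n - k < min_leaf or lo == hi:
                continue
            # Maximising this score is equivalent to minimising squared error.
            score = left_sum ** 2 / k + (total - left_sum) ** 2 / (n - k)
            if best is None or score > best[0]:
                best = (score, j, lo)
    if best is None:
        return mean(y)
    _, j, thr = best
    left = [i for i in range(n) if X[i][j] <= thr]
    right = [i for i in range(n) if X[i][j] > thr]
    return (j, thr,
            fit_tree([X[i] for i in left], [y[i] for i in left],
                     depth + 1, max_depth, min_leaf),
            fit_tree([X[i] for i in right], [y[i] for i in right],
                     depth + 1, max_depth, min_leaf))

def predict(tree, x):
    """Walk from the root to a leaf and return the leaf's mean."""
    while isinstance(tree, tuple):
        j, thr, left, right = tree
        tree = left if x[j] <= thr else right
    return tree

def simulate(n, rng):
    """Synthetic stand-in for historical records (assumed, not real data)."""
    X, y = [], []
    for _ in range(n):
        irr = rng.uniform(0, 1000)   # solar irradiance, W/m^2
        temp = rng.uniform(5, 40)    # ambient temperature, deg C
        gen = 0.005 * irr * (1 - 0.004 * (temp - 25)) + rng.gauss(0, 0.05)
        X.append((irr, temp))
        y.append(max(gen, 0.0))      # honest meter reading
    return X, y

rng = random.Random(0)
X_train, y_train = simulate(400, rng)
X_test, y_test = simulate(200, rng)
tree = fit_tree(X_train, y_train)

# Error distribution on honest training data; the Gaussian fit is an assumption.
errors = [r - predict(tree, x) for x, r in zip(X_train, y_train)]
threshold = NormalDist.from_samples(errors).inv_cdf(0.995)

def is_suspicious(x, reported):
    """Flag readings that exceed the prediction by more than the threshold."""
    return reported - predict(tree, x) > threshold

# Hypothetical theft scenario: every reading is inflated by 50%.
theft = [1.5 * r for r in y_test]
false_alarm_rate = sum(is_suspicious(x, r) for x, r in zip(X_test, y_test)) / len(y_test)
detection_rate = sum(is_suspicious(x, r) for x, r in zip(X_test, theft)) / len(theft)
print(f"detection rate: {detection_rate:.2f}, false alarms: {false_alarm_rate:.2f}")
```

Only over-reporting is tested here (a one-sided threshold), since the attacks described above inflate the injected energy. In practice a library regression tree would replace the hand-written `fit_tree`, and the error distribution could be estimated nonparametrically instead of assumed Gaussian.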