The main objective of this work is to investigate efficient estimation of the optimal design variables that maximize net present value (NPV) in life-cycle production optimization of a single-well CO2 huff-n-puff (HnP) process in unconventional oil reservoirs. It extends our previous work, in which the only design variables were well controls, such as injection rate and production bottomhole pressure (BHP), and the durations of the injection and production periods, and in which a single, simple unconventional reservoir model was used that ignored double-permeability and geomechanical effects. In this work, we add the length of each cycle to the set of design variables. We also consider a more realistic unconventional reservoir model in which Bakken oil composition is used as the reservoir fluid and natural fractures and geomechanical effects are included. In addition, we present applications of robust life-cycle optimization, in which uncertainty in the reservoir model is represented by a set (ensemble) of reservoir models and the NPV is maximized over that ensemble. During optimization, the NPV is calculated by a machine learning (ML) proxy model trained to accurately approximate the NPV that would be obtained from a reservoir simulator run. Two ML algorithms are used: least-squares support vector regression (LS-SVR) and Gaussian process regression (GPR). The proxy is built with the chosen ML method from forward simulation results generated by a commercial compositional simulator that models the miscible CO2 HnP process. The proxy is then used directly within an iterative training-optimization algorithm to optimize the design variables, with sequential quadratic programming (SQP) as the optimizer.
The computational efficiency of the ML proxy-based optimization methods is compared with that of the conventional stochastic simplex approximate gradient (StoSAG) method and/or the simplex gradient method. Our results show that the LS-SVR and GPR proxy models are accurate and useful for approximating NPV in optimization of the CO2 HnP process. The results also indicate that the GPR and LS-SVR methods exhibit very similar convergence rates and require similar computational time for optimization. Both ML-based methods are efficient for production optimization: they are at least 4 times more efficient than a gradient ascent algorithm that uses a stochastic gradient computed directly from a high-fidelity compositional simulator. To the best of our knowledge, this work is the first to present a detailed investigation of LS-SVR and GPR, in comparison with StoSAG and the simplex gradient method, for the optimal well-control problem of a complex miscible CO2 HnP process in unconventional oil reservoirs. We also provide insight into proper training of the LS-SVR and GPR proxies for this type of life-cycle production optimization problem.
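As an illustration only, the iterative training-optimization loop described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the implementation used in this work: `toy_npv` is a hypothetical analytic stand-in for a compositional simulator run, the design variables are assumed to be scaled to [0, 1], the LS-SVR uses an RBF kernel with assumed hyperparameters (`gamma`, `length_scale`), and SciPy's SLSQP routine stands in for the SQP optimizer. All function names and the stopping rule are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize


def toy_npv(u):
    """Hypothetical stand-in for one reservoir-simulator NPV evaluation."""
    u = np.asarray(u, dtype=float)
    return float(1.0 - np.sum((u - 0.6) ** 2))


def rbf_kernel(A, B, length_scale):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * length_scale ** 2))


def fit_ls_svr(X, y, gamma=1e4, length_scale=0.5):
    """Fit LS-SVR by solving [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, length_scale) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    b, alpha = sol[0], sol[1:]

    def predict(u):
        k = rbf_kernel(np.atleast_2d(u), X, length_scale)
        return float((k @ alpha)[0] + b)

    return predict


def optimize_with_proxy(n_vars=3, n_init=20, n_iter=10, tol=1e-4, seed=0):
    """Alternate between proxy training and proxy-based optimization."""
    rng = np.random.default_rng(seed)
    bounds = [(0.0, 1.0)] * n_vars

    # Initial training set: random designs evaluated with the "simulator".
    X = rng.uniform(0.0, 1.0, size=(n_init, n_vars))
    y = np.array([toy_npv(x) for x in X])

    for _ in range(n_iter):
        u_best = X[np.argmax(y)]
        predict = fit_ls_svr(X, y)

        # Maximize the proxy NPV (minimize its negative) from the best design.
        res = minimize(lambda u: -predict(u), u_best,
                       method="SLSQP", bounds=bounds)
        u_new = np.clip(res.x, 0.0, 1.0)

        # Stop when the proxy optimum no longer moves away from the best design.
        if np.linalg.norm(u_new - u_best) < tol:
            break

        # Otherwise evaluate the new design and add it to the training set.
        X = np.vstack([X, u_new])
        y = np.append(y, toy_npv(u_new))

    i_best = int(np.argmax(y))
    return X[i_best].copy(), float(y[i_best])


if __name__ == "__main__":
    u_opt, npv_opt = optimize_with_proxy()
    print("best design:", u_opt, "NPV:", npv_opt)
```

In this sketch each outer iteration costs one "simulator" call, while all optimizer iterations run on the inexpensive proxy. A GPR proxy could be substituted by replacing `fit_ls_svr` with any function that returns a comparable `predict` callable.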