Re-train AutoML Tabular model

Hi everyone,

I've built a successful AutoML model for tabular data in Google Cloud (Vertex AI) and have been pleased with the initial results. However, I've made some refinements to my training dataset and want to retrain the model to incorporate these changes.

My main goal is to maintain the exact configuration from my first model run. I want to ensure consistency while retraining with the updated data. Is there a straightforward way to achieve this?

Ideally, I'd like a streamlined approach to preserve the existing configuration while simply updating the dataset. Any suggestions or best practices would be greatly appreciated!

Thanks in advance!

1 REPLY

Gemini suggested this gcloud CLI command, but it doesn't work:

gcloud ai-platform pipelines runs create \
--region=us-central1 \
--pipeline-service-account=service-XXX@gcp-sa-aiplatform.iam.gserviceaccount.com \
--pipeline-root=gs://your-bucket/your-pipeline-root \
--display-name=automl-tabular-20240628050148 \
--location=us-central1 \
--project=retco9 \
--service-account=service-XXXX@gcp-sa-aiplatform.iam.gserviceaccount.com \
--template=gs://google-cloud-aiplatform/schema/trainingjob/definition/automl_tabular_1.0.0.yaml \
--parameter=model.display_name=automl-tabular-20240628050148 \
--parameter=model.prediction_type=regression \
--parameter=model.optimization_objective=minimize-rmse \
--parameter=model.training_fraction_split=0.8 \
--parameter=model.validation_fraction_split=0.1 \
--parameter=model.test_fraction_split=0.1 \
--parameter=model.disable_early_stopping=false \
--parameter=model.max_parallel_workers=10 \
--parameter=model.budget_milli_node_hours=1000 \
--parameter=model.model_type=AUTOML_TENSORFLOW_ESTIMATOR \
--parameter=model.optimization_problem_type=regression \
--parameter=model.target_column=your_target_column \
--parameter=model.transformations.transformations.categorical.categorical_encoding_type=ONE_HOT \
--parameter=model.transformations.transformations.numerical.numerical_scaling_type=MIN_MAX \
--parameter=model.transformations.transformations.text.text_encoding_type=TFIDF \
--parameter=model.transformations.transformations.time_series.time_series_encoding_type=AUTO \
--parameter=model.transformations.transformations.image.image_encoding_type=AUTO \
--parameter=model.transformations.transformations.video.video_encoding_type=AUTO \
--parameter=model.transformations.transformations.audio.audio_encoding_type=AUTO \
--parameter=model.transformations.transformations.structured.structured_encoding_type=AUTO \
--parameter=model.transformations.transformations.unstructured.unstructured_encoding_type=AUTO \
--parameter=model.transformations.transformations.tabular.tabular_encoding_type=AUTO

But it fails with ERROR: (gcloud.ai-platform) Invalid choice: 'pipelines'.
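The error says that `pipelines` is not a valid choice under `gcloud ai-platform`, so the command cannot run no matter which parameters it is given. One possible alternative, offered as a rough sketch rather than a confirmed fix, is to start the retraining job from the Vertex AI SDK for Python (`google-cloud-aiplatform`). The settings below are copied from the command above (regression, minimize-rmse, 0.8/0.1/0.1 split, 1000 milli node hours) and may not match what the original training job actually used, so check them against that job first. The display names, the `check_splits` helper, and the `retrain` arguments are placeholders made up for illustration.

```python
"""Sketch: retrain an AutoML tabular model with the same settings on new data."""

# Values taken from the command quoted in this thread; verify them against the
# original training job before relying on them.
JOB_SETTINGS = {
    "display_name": "automl-tabular-retrain",  # hypothetical name
    "optimization_prediction_type": "regression",
    "optimization_objective": "minimize-rmse",
}
RUN_SETTINGS = {
    "target_column": "your_target_column",  # placeholder from the thread
    "training_fraction_split": 0.8,
    "validation_fraction_split": 0.1,
    "test_fraction_split": 0.1,
    "budget_milli_node_hours": 1000,
    "disable_early_stopping": False,
}


def check_splits(run_settings):
    """Return True when the three split fractions sum to 1."""
    total = (
        run_settings["training_fraction_split"]
        + run_settings["validation_fraction_split"]
        + run_settings["test_fraction_split"]
    )
    return abs(total - 1.0) < 1e-9


def retrain(project, location, gcs_csv_uri):
    """Create a dataset from the updated CSV and train with the saved settings."""
    # Imported here so the settings can be inspected without the SDK installed.
    from google.cloud import aiplatform

    if not check_splits(RUN_SETTINGS):
        raise ValueError("split fractions must sum to 1")

    aiplatform.init(project=project, location=location)

    # Register the refined data as a new tabular dataset.
    dataset = aiplatform.TabularDataset.create(
        display_name="updated-training-data",  # hypothetical name
        gcs_source=[gcs_csv_uri],
    )

    # Same objective and prediction type as before; only the dataset changes.
    job = aiplatform.AutoMLTabularTrainingJob(**JOB_SETTINGS)
    return job.run(
        dataset=dataset,
        model_display_name=JOB_SETTINGS["display_name"],
        **RUN_SETTINGS,
    )
```

Calling `retrain("your-project", "us-central1", "gs://your-bucket/updated.csv")` with real values would start a new training job. Keeping the settings in two dictionaries means the same configuration can be reused for every retraining run while only the data source changes.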