Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SAMZA-2420: Update CommandLine to use config loader for local config file #1256

Merged
merged 7 commits into from
Jan 30, 2020
Merged

Conversation

kw2542
Copy link
Contributor

@kw2542 kw2542 commented Jan 22, 2020

This is backward incompatible and is only supposed to be included in Samza 1.5

Design:
https://cwiki.apache.org/confluence/display/SAMZA/SEP-23%3A+Simplify+Job+Runner

Changes:

  1. Update CommandLine to use config loader instead of config factory
  2. Removed common.properties and adds its values to each job config
  3. Update CheckpointTool to read new offsets from local file only.

API Changes:

  1. Add config-loader-factory and config-loader-properties in CommandLine to support specifying ConfigLoaderFactory and its properties needed to load config.
  2. Remove config-factory and config-path in CommandLine to discountinue the usage of ConfigFactory
  3. Update CheckpointTool to read new offsets from local file only, i.e. --new-offsets only supports local file URI now.

Upgrade Instructions:

All usages in CommandLine and its subclasses will switch from --config-factory & --config-path to --config-loader-factory & --config-loader-properties, including job launch.

Usage Instructions

For example,

--config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file:///location/file.txt

will be changed to

--config-loader-factory=org.apache.samza.config.loaders.PropertiesConfigLoaderFactory --config-loader-properties path=file:///location/file.txt

Tests

  1. Unit tests
  2. Test against Hello Samza example with

deploy/samza/bin/run-app.sh --config-loader-factory=org.apache.samza.config.loaders.PropertiesConfigLoaderFactory --config-loader-properties path=$PWD/deploy/samza/config/wikipedia-feed.properties

Ke Wu added 4 commits January 21, 2020 16:08
…file

Design:
https://cwiki.apache.org/confluence/display/SAMZA/SEP-23%3A+Simplify+Job+Runner

Changes:
1. Update CommandLine to use config loader instead of config factory
2. Removed common.properties and adds its values to each job config
3. Update CheckpointTool to read new offsets from local file only.

API Changes:
1. Add config-loader-factory and config-loader-properties in CommandLine to support specifying ConfigLoaderFactory and its properties needed to load config.
2. Remove config-factory and config-path in CommandLine to discountinue the usage of ConfigFactory
3. Update CheckpointTool to read new offsets from local file only.

Upgrade Instructions:

All usages in CommandLine and its subclasses will switch from --config-factory & --config-path to --config-loader-factory & --config-loader-properties, including job launch.

Usage Instructions

For example,

--config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file:///location/file.txt

will be changed to

--config-loader-factory=org.apache.samza.config.loaders.PropertiesConfigLoaderFactory --config-loader-properties path=file:///location/file.txt

Tests

1. Unit tests
2. Test against Hello Samza example with

  deploy/samza/bin/run-app.sh --config-loader-factory=org.apache.samza.config.loaders.PropertiesConfigLoaderFactory --config-loader-properties path=$PWD/deploy/samza/config/wikipedia-feed.properties
Copy link
Contributor

@mynameborat mynameborat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You will need to follow up this patch with another patch for samza-hello-samza.
Looks like website documentation also needs to be updated.

// Verify legitimate parameters.
if (!options.has(configPathOpt)) {
if (!options.has(configLoaderPropertiesOpt)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this check for configLoaderFactoryOpt instead? Theoretically, the config loader factory won't need additional properties. You might also need to somehow validate or catch that not all config loader properties were specified for the config loader.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

configLoaderFactoryOpt has a default value of PropertiesConfigLoaderFactory.

This check is to be consistent with the previous check where we require a config path (URI) to pass to the legacy ConfigFactory, which in theory may not need URI either.

We may remove this check completely too, if we want to support the case where config loader factory does not need any properties at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the previous case (i.e. ConfigFactory), the configURI was a required argument for the interface, and I don't think you can have an empty URI. For ConfigLoaderFactory interface, it is required to pass a Config, but it is possible to have an empty Config, so I think you could just remove the check to handle the general case.
The additional complexity that the ConfigLoaderFactory brings is that each concrete type requires different configs to be in Config, so if there is an error due to a missing config value, then that's maybe where you need to do the parser.printHelpOn(System.err); System.exit(-1);.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, removed the check.

The default concrete PropertiesConfigLoaderFactory is checking the properties itself too.

@cameronlee314 cameronlee314 merged commit 1a03a6a into apache:master Jan 30, 2020
@kw2542 kw2542 deleted the SAMZA-2420 branch January 30, 2020 22:44
cameronlee314 added a commit to cameronlee314/samza that referenced this pull request Feb 8, 2020
cameronlee314 added a commit that referenced this pull request Feb 8, 2020
… config file (#1256)" (#1270)

This reverts commit 1a03a6a.

Reverting this change for the 1.4.0 branch since it is backwards incompatible and we don't want to release this for 1.4.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
3 participants