Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
Description
1. According to the Javadoc of the WorkUnit class:
@deprecated Properties in {@link WorkUnit} contain a copy of {@link SourceState} is a waste of memory. Use {@link #create(Extract, WatermarkInterval)}.
2. So the QueryBasedSource class creates a WorkUnit this way:
WorkUnit.create(extract)
The resulting WorkUnit carries no information from SourceState.
3. But many extractors (including JdbcExtractor, MysqlExtractor and QueryBasedExtractor) read properties this way:
this.workUnit.getProp(source.conn.driver)
4. This can return null.
So, in my opinion, we should get properties from WorkUnitState rather than WorkUnit.
Github Url : https://github.com/linkedin/gobblin/issues/1065
Github Reporter : ggthename
Github Created At : 2016-06-23T08:55:05Z
Github Updated At : 2017-01-12T05:05:10Z
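The failure mode reported above can be reproduced with a minimal sketch. The classes below are simplified stand-ins written for illustration, not Gobblin's actual `WorkUnit` or `SourceState`. Only the property key `source.conn.driver` comes from the report, and the driver class name is an assumed example value.

```java
import java.util.Properties;

public class WorkUnitNullPropDemo {

    // Hypothetical stand-in for a Gobblin state object: a thin wrapper over Properties.
    static class SimpleState {
        private final Properties props = new Properties();

        void setProp(String key, String value) {
            props.setProperty(key, value);
        }

        String getProp(String key) {
            return props.getProperty(key); // null when the key is absent
        }
    }

    // Models a work unit created without copying the job-level state:
    // the sourceState argument is deliberately ignored.
    static SimpleState createWorkUnitWithoutSourceState(SimpleState sourceState) {
        return new SimpleState();
    }

    public static void main(String[] args) {
        SimpleState sourceState = new SimpleState();
        sourceState.setProp("source.conn.driver", "com.mysql.jdbc.Driver"); // example value

        SimpleState workUnit = createWorkUnitWithoutSourceState(sourceState);

        // The extractor-style lookup goes to the work unit only, so the key is missing.
        String driver = workUnit.getProp("source.conn.driver");
        System.out.println("driver from work unit: " + driver);

        try {
            driver.length(); // any use of the missing value fails here
        } catch (NullPointerException e) {
            System.out.println("NPE when the extractor uses the missing property");
        }
    }
}
```

In this model the job-level state still holds the property, so the value is not lost; the lookup simply goes to the wrong object.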
Comments
stakiar wrote on 2016-07-02T17:15:10Z : Hello @ggthename,
I'm not sure if I understand your question properly. Is this an actual bug you are seeing?
Some background:
- A `WorkUnitState` is just a wrapper around a `WorkUnit`; it only adds a few runtime properties
- A `WorkUnit` defines the work to be done in an individual Gobblin `Task`. A job usually consists of many `WorkUnit`s, each covering some division of the overall work
- `SourceState` is the global configuration for an entire job, so it is basically a set of configuration properties global to all `WorkUnit`s
This wiki has some more documentation on how this all works: http://gobblin.readthedocs.io/en/latest/user-guide/State-Management-and-Watermarks/#gobblin-state-deep-dive
Github Url : https://github.com/linkedin/gobblin/issues/1065#issuecomment-230112363
chosh0615 wrote on 2016-07-03T09:23:21Z : As @ggthename mentioned, the JdbcExtractor and QueryBasedExtractor classes read some properties from WorkUnit and get null values as a result.
For instance, the JDBC connection properties should be read from WorkUnitState.
Since WorkUnitState.getProp looks up properties in WorkUnitState itself, WorkUnit, and JobState, we can simply call WorkUnitState.getProp from JdbcExtractor and QueryBasedExtractor.
Github Url : https://github.com/linkedin/gobblin/issues/1065#issuecomment-230143957
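The proposed change relies on a layered lookup, which can be sketched as follows. This is a simplified model under stated assumptions: `LayeredState`, its fields, and `exampleState` are hypothetical names invented for illustration, not Gobblin's `WorkUnitState`, and the driver class name is an example value.

```java
import java.util.Properties;

public class LayeredLookupDemo {

    // Hypothetical stand-in for WorkUnitState: its own properties plus two fallbacks.
    static class LayeredState {
        final Properties own = new Properties();      // runtime properties of the task
        final Properties workUnit = new Properties(); // properties of the wrapped work unit
        final Properties jobState = new Properties(); // job-wide configuration

        // Returns the first non-null value found in own, workUnit, jobState (in that order).
        String getProp(String key) {
            String value = own.getProperty(key);
            if (value == null) {
                value = workUnit.getProperty(key);
            }
            if (value == null) {
                value = jobState.getProperty(key);
            }
            return value;
        }
    }

    // Builds a state where the connection driver is configured only at the job level.
    static LayeredState exampleState() {
        LayeredState state = new LayeredState();
        state.jobState.setProperty("source.conn.driver", "com.mysql.jdbc.Driver"); // example value
        return state;
    }

    public static void main(String[] args) {
        LayeredState state = exampleState();

        // Reading from the work unit layer alone misses the job-level property.
        System.out.println("work unit only: " + state.workUnit.getProperty("source.conn.driver"));

        // Reading through the layered lookup finds it.
        System.out.println("layered lookup: " + state.getProp("source.conn.driver"));
    }
}
```

Under this model, an extractor that reads through the layered state does not depend on which layer a property was written to.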
abti wrote on 2016-07-13T11:54:09Z : @ggthename Yes, this is a known issue that popped up with a few recent optimizations. It is being tracked here: https://github.com/linkedin/gobblin/issues/1022
Github Url : https://github.com/linkedin/gobblin/issues/1065#issuecomment-232333237
ggthename wrote on 2016-07-14T02:32:33Z : @abti - I think this is a little different from #1022.
This issue is about properties that are read from WorkUnit but are no longer available in it,
whereas issue #1022 was about properties that are read from WorkUnitState but are not available in it.
Properties that are missing from WorkUnit but available in WorkUnitState can be read by calling the WorkUnitState.getProp(String) method.
This method first tries to read the property from WorkUnitState itself, and reads from WorkUnit and JobState if it does not find it there.
```
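public String getProp(String key) {
  // Lookup order: WorkUnitState itself, then WorkUnit, then JobState.
  String value = super.getProp(key);
  if (value == null) {
    value = this.workUnit.getProp(key);
  }
  if (value == null) {
    value = this.jobState.getProp(key);
  }
  return value;
}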
```
Github Url : https://github.com/linkedin/gobblin/issues/1065#issuecomment-232540916
jinhyukchang wrote on 2016-07-26T16:54:20Z : Was there any change in how properties are populated into WorkUnit? With the latest build, the integration test on MysqlExtractor fails with an NPE because of this.
java.lang.NullPointerException
at gobblin.source.extractor.extract.jdbc.MysqlExtractor.getConnectionUrl(MysqlExtractor.java:172)
at gobblin.source.extractor.extract.jdbc.JdbcExtractor.createJdbcSource(JdbcExtractor.java:716)
at gobblin.source.extractor.extract.jdbc.JdbcExtractor.executePreparedSql(JdbcExtractor.java:675)
at gobblin.source.extractor.extract.jdbc.JdbcExtractor.extractMetadata(JdbcExtractor.java:289)
at gobblin.source.extractor.extract.QueryBasedExtractor.build(QueryBasedExtractor.java:244)
at gobblin.source.extractor.extract.jdbc.MysqlSource.getExtractor(MysqlSource.java:40)
at gobblin.runtime.TaskContext.getExtractor(TaskContext.java:119)
at gobblin.runtime.Task.run(Task.java:127)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Github Url : https://github.com/linkedin/gobblin/issues/1065#issuecomment-235332200
chavdar wrote on 2016-07-27T23:21:49Z : @tuGithub can you have a look? Your Salesforce changes might have addressed this issue.
Github Url : https://github.com/linkedin/gobblin/issues/1065#issuecomment-235751255
tuGithub wrote on 2016-07-28T17:47:10Z : looking...
Github Url : https://github.com/linkedin/gobblin/issues/1065#issuecomment-235971049
jinhyukchang wrote on 2016-08-15T22:13:01Z : Hi, Is there any update on this issue?
Github Url : https://github.com/linkedin/gobblin/issues/1065#issuecomment-239945453
chavdar wrote on 2016-08-16T00:10:54Z : @tuGithub have you had the time to look at this?
Github Url : https://github.com/linkedin/gobblin/issues/1065#issuecomment-239966844
jinhyukchang wrote on 2016-09-13T16:37:52Z : Hi, has this issue been resolved?
Github Url : https://github.com/linkedin/gobblin/issues/1065#issuecomment-246743380
lbendig wrote on 2016-09-13T20:32:08Z : @jinhyukchang The issue is still there and without patching the extractors, NPEs are thrown.
Github Url : https://github.com/linkedin/gobblin/issues/1065#issuecomment-246814167
chosh0615 wrote on 2016-09-14T04:32:44Z : There is a pull request for this issue: #1085
@jinhyukchang, you can merge it and try it for now.
Github Url : https://github.com/linkedin/gobblin/issues/1065#issuecomment-246903908