[SPARK-20767] The training continuation for saved LDA model - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Duplicate
Affects Version/s: 2.1.1
Fix Version/s: None
Component/s: ML
Labels: None

Description

The current online implementation of LDA model fitting (OnlineLDAOptimizer) supports neither updating an existing model (i.e. to account for population/covariate drift) nor continuing the fit when the number of iterations turned out to be insufficient.

Technical aspects:

1. The LDA fitting implementation does not currently allow the coefficients to be preset (the setter is private), as noted by a comment in the source code of OnlineLDAOptimizer.setLambda: "This is only used for testing now. In the future, it can help support training stop/resume".

2. The optimizer always initializes the lambda matrix randomly; this needs to change so that a preset lambda matrix is used when one is supplied.

Users cannot adapt the classes themselves because of the protected setters and sealed/final classes.
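
As an illustration of the behavior requested in the two points above, here is a minimal sketch in plain Python. It is not Spark code: the function name `init_lambda`, its `initial_lambda` parameter, and the Gamma(100, 1/100) random draw are assumptions made for this example only. The idea is that a preset matrix is used when supplied, and random initialization becomes the fallback.

```python
import random


def init_lambda(k, vocab_size, initial_lambda=None, seed=0):
    """Return the starting topic-word matrix (k rows, vocab_size columns).

    Hypothetical helper for illustration; not part of any Spark API.
    """
    if initial_lambda is not None:
        # Warm start: validate the shape, then copy the preset matrix
        # so later updates do not mutate the caller's saved model.
        if len(initial_lambda) != k or any(
            len(row) != vocab_size for row in initial_lambda
        ):
            raise ValueError("initial_lambda must have shape k x vocab_size")
        return [list(row) for row in initial_lambda]

    # Cold start: random positive entries. The Gamma(shape=100, scale=0.01)
    # draw is an assumed choice for this sketch.
    rng = random.Random(seed)
    return [
        [rng.gammavariate(100.0, 0.01) for _ in range(vocab_size)]
        for _ in range(k)
    ]
```

Under these assumptions, resuming or updating a fit would amount to passing the lambda matrix of a previously saved model as `initial_lambda` and running further iterations from there.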

Issue Links

duplicates

SPARK-20082 Incremental update of LDA model, by adding initialModel as start point

Resolved

People

Assignee: Unassigned

Reporter: Cezary Dendek

Votes: 1

Watchers: 4

Dates

Created: 16/May/17 11:41

Updated: 26/May/17 08:51

Resolved: 26/May/17 08:51