  1. S2Graph
  2. S2GRAPH-226

Provide example spark jobs to explain how to utilize WAL log.

Details

    • Type: New Feature
    • Status: Done
    • Priority: Minor
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: s2core, s2jobs
    • Labels: None

    Description

      Even though s2graph publishes every incoming vertex and edge to Kafka, there is no example showing how to use this WAL (write-ahead log).

      I suggest adding a simple example showing how to process the WAL. Let me explain which use cases this example can benefit.

      At Kakao, s2graph has been used as the fact storage, which stores all of a user's activities, such as clicking content, buying a product, or issuing a search query.

      [{
      	"timestamp": 1,
      	"elem": "e",
      	"from": "steamshon",
      	"to": "s2graph",
      	"label": "search_query",
      	"props": {}
      }, {
      	"timestamp": 10,
      	"elem": "e",
      	"from": "steamshon",
      	"to": "github.com/apache/incubator-s2graph",
      	"label": "content_click",
      	"props": {}
      }, {
      	"timestamp": 12,
      	"elem": "v",
      	"id": "steamshon",
      	"serviceName": "s2graph",
      	"columnName": "user",
      	"props": {
      		"gender": "M"
      	}
      }]
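
A minimal sketch of how a batch of WAL entries shaped like the sample above might be read. The function name `split_wal` and the idea of ordering by `timestamp` are assumptions for illustration, not part of s2graph; only the field names (`elem`, `timestamp`, `from`, `to`, `label`, `id`) come from the sample.

```python
import json

# WAL batch copied from the sample above.
raw = """
[{"timestamp": 1, "elem": "e", "from": "steamshon", "to": "s2graph",
  "label": "search_query", "props": {}},
 {"timestamp": 10, "elem": "e", "from": "steamshon",
  "to": "github.com/apache/incubator-s2graph",
  "label": "content_click", "props": {}},
 {"timestamp": 12, "elem": "v", "id": "steamshon", "serviceName": "s2graph",
  "columnName": "user", "props": {"gender": "M"}}]
"""


def split_wal(records):
    """Order records by timestamp, then separate edges ("e") from vertices ("v")."""
    ordered = sorted(records, key=lambda r: r["timestamp"])
    edges = [r for r in ordered if r["elem"] == "e"]
    vertices = [r for r in ordered if r["elem"] == "v"]
    return edges, vertices


edges, vertices = split_wal(json.loads(raw))
for e in edges:
    print(e["timestamp"], e["from"], "-[" + e["label"] + "]->", e["to"])
for v in vertices:
    print(v["timestamp"], v["id"], v["props"])
```

In a real job the records would presumably arrive from a Kafka topic rather than a string literal; that wiring is left out here.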
      

      Each activity (a label, in s2graph terms) forms its own graph, but connecting them all together yields much more information.

      The edges above can be aggregated into a vertex.

      How to connect the graphs is up to the user, but in our case we used `user` to merge multiple graphs. For example, we made every activity (clicking content, buying a product, issuing a search query) use the same `userId` for the same `user`.

      Below is a simple example of the resulting data.

      {
      	"timestamp": 10,
      	"elem": "v",
      	"id": "steamshon",
      	"serviceName": "s2graph",
      	"columnName": "user",
      	"props": {
      		"gender": "M",
      		"edges": [{
      			"timestamp": 1,
      			"to": "s2graph",
      			"label": "search_query",
      			"props": {}
      		}, {
      			"timestamp": 10,
      			"to": "github.com/apache/incubator-s2graph",
      			"label": "content_click",
      			"props": {}
      		}]
      	}
      }
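
The aggregation described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the proposed Spark job: the function name `aggregate` is hypothetical, edges are attached to the vertex whose `id` equals the edge's `from`, and, since the issue does not state how the aggregated `timestamp` is chosen, the sketch uses the latest edge timestamp (which reproduces the value 10 in the example).

```python
import json
from collections import defaultdict

# WAL entries from the first sample in this issue.
wal = [
    {"timestamp": 1, "elem": "e", "from": "steamshon", "to": "s2graph",
     "label": "search_query", "props": {}},
    {"timestamp": 10, "elem": "e", "from": "steamshon",
     "to": "github.com/apache/incubator-s2graph",
     "label": "content_click", "props": {}},
    {"timestamp": 12, "elem": "v", "id": "steamshon", "serviceName": "s2graph",
     "columnName": "user", "props": {"gender": "M"}},
]


def aggregate(records):
    """Fold each user's outgoing edges into that user's vertex props."""
    edges_by_user = defaultdict(list)
    vertices = {}
    for rec in records:
        if rec["elem"] == "e":
            # Drop "elem" and "from": the enclosing vertex already implies them.
            edge = {k: rec[k] for k in ("timestamp", "to", "label", "props")}
            edges_by_user[rec["from"]].append(edge)
        elif rec["elem"] == "v":
            # The last vertex record seen for an id wins.
            vertices[rec["id"]] = rec

    result = []
    # Users that have edges but no vertex record are skipped in this sketch.
    for vid, v in vertices.items():
        edges = sorted(edges_by_user.get(vid, []), key=lambda e: e["timestamp"])
        # Assumption: use the latest edge timestamp, else the vertex's own.
        ts = max((e["timestamp"] for e in edges), default=v["timestamp"])
        result.append({
            "timestamp": ts,
            "elem": "v",
            "id": vid,
            "serviceName": v["serviceName"],
            "columnName": v["columnName"],
            "props": {**v["props"], "edges": edges},
        })
    return result


aggregated = aggregate(wal)
print(json.dumps(aggregated[0], indent=2))
```

The same grouping would map naturally onto a group-by-key step in Spark, with `from` (for edges) and `id` (for vertices) as the key.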
      

      This connected graph can be used not only for OLTP but also for OLAP.
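
To make the OLTP/OLAP distinction concrete, here is a small sketch over the aggregated vertex shown in this issue. Both helper names (`latest_activity`, `label_counts_by_gender`) are hypothetical, and an in-memory list stands in for whatever storage a real deployment would use.

```python
from collections import Counter

# Aggregated vertex copied from the example in this issue.
vertices = [{
    "timestamp": 10, "elem": "v", "id": "steamshon",
    "serviceName": "s2graph", "columnName": "user",
    "props": {
        "gender": "M",
        "edges": [
            {"timestamp": 1, "to": "s2graph",
             "label": "search_query", "props": {}},
            {"timestamp": 10, "to": "github.com/apache/incubator-s2graph",
             "label": "content_click", "props": {}},
        ],
    },
}]

by_id = {v["id"]: v for v in vertices}


def latest_activity(user_id):
    """OLTP-style point lookup: the most recent edge of a single user."""
    return max(by_id[user_id]["props"]["edges"], key=lambda e: e["timestamp"])


def label_counts_by_gender(vs):
    """OLAP-style scan: count activities per (gender, label) across all users."""
    counts = Counter()
    for v in vs:
        gender = v["props"].get("gender", "unknown")
        for e in v["props"]["edges"]:
            counts[(gender, e["label"])] += 1
    return counts


print(latest_activity("steamshon")["label"])
print(dict(label_counts_by_gender(vertices)))
```

The first helper touches one vertex by key, while the second scans every vertex, which is the kind of workload a batch job over the WAL would take on.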

      I believe the s2graph WAL is a good way to integrate OLTP and OLAP, and adding this example can help users understand how to leverage it.

      design doc (work in progress)

            People

              Assignee: steamshon Do Yung Yoon
              Reporter: steamshon Do Yung Yoon
              Votes: 0
              Watchers: 2

                Time Tracking

                  Estimated: 336h
                  Remaining: 336h
                  Logged: Not Specified