To expose the data in DBs that are ingested by DataLake, we need create a general framework. Here is the work flow.  

      1. In DL configuration DB, a new table “data_exposure” has the following columns:
        id: type varchar (example ‘ns_user_number’)
        sql_template: example:  
        select from_unixtime(commonEventHeader.lastEpochMicrosec/1000) as timeStamp,
        commonEventHeader.sourceId as id,
        sum(measurementFields.additionalFields."AMF.RegSub._NS") as userNumber
        from amf1
        where commonEventHeader.sourceId = '${id}'
        and from_unixtime(commonEventHeader.lastEpochMicrosec/1000) <= from_iso8601_timestamp( '${timeStamp}')
        group by commonEventHeader.lastEpochMicrosec, commonEventHeader.sourceId
        order by commonEventHeader.lastEpochMicrosec desc
        limit ${dataCount}
        db_id: a FK to DL DB table.
      2. A data consumer client sends a request via url like:
        /datalake/v1/exposure/userNumber
        with parameters
        {
        "id": "460_08_010101",
        "timeStamp": "2019-09-26T21:54:27",
        "dataCount": 16
        }
      3. Data Exposure Service (DES) reads the request id ‘userNumber’ from url, looks up table “data_exposure”, finds sql template, which is then populated with the parameters in HTTP request. 
      4. DES sends the constructed sql, along with DB (MongoDB) info, to Presto using JDBC query, and get the results as JDBS ResultSet, which DES then convert to JSON, like

       
      {
          "result": [
              {
                  "timeStamp": "2019-09-26T13:54:27.000+0000",
                  "id": "460_08_010101",
                  "userNumber": 100
              },
              {
                  "timeStamp": "2019-09-26T12:54:27.000+0000",
                  "id": "460_08_010101",
                  "userNumber": 100
              }
          ],
          "request": {
              "id": "460_08_010101",
              "timeStamp": "2019-09-26T21:54:27",
              "dataCount": "16"
          },
          "result_count": 2
      }

      5. The JSON is sent back to the client as REST response.

       

       -----------------------------------------------------

      Here is an example row in MongoDB table(collection) that the above example queries:

       

      {

          "commonEventHeader": {

              "sourceId": "460_08_010101",

              "startEpochMicrosec": 1569501000000,

              "eventId": "",

              "timeZoneOffset": "GMT+08:00",

              "reportingEntityId": "3201HWHX1AMF1003223461889",

              "eventType": "AMF",

              "priority": "High",

              "version": "V0.4.0",

              "nfVendorName": "Huawei",

              "reportingEntityName": "NJ-AMF-1-A-HW",

              "sequence": "",

              "domain": "measurement",

              "lastEpochMicrosec": 1569502467000,

              "eventName": "AMF",

              "vesEventListenerVersion": "7.1",

              "sourceName": "460_08_010101"

          },

          "measurementFields": {"additionalFields": {"AMF.RegSub._NS": 10}}

      }

            guobiaomo guobiaomo
            guobiaomo guobiaomo
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Created:
              Updated:
              Resolved: