How to structure data before uploading it to the platform

Modified on Sat, 4 May, 2024 at 10:47 AM

Uploaded files must follow a specific structure for models to be generated successfully. Machine-learning guidelines for tabular data are not always compatible with typical engineering practices for laying out tables.


Requirements

When uploading data to the Monolith platform, please make sure it satisfies the following requirements.

  • Each column should contain only one variable.
  • Each row should correspond to one experiment or sequence stamp, identified by an index (e.g. test number, simulation ID, time).
  • Each column of the dataset is classified as a single type of data (e.g. string, float), so it is not recommended to have more than one data type in the same column.
  • When using categorical data (i.e. strings), consistency is key. The platform treats spaces as significant and is case sensitive, so “MonolithAI”, “Monolith AI” and “Monolith ai” are all classified as different categories.
  • Column labels should occupy only one row (i.e. nested categorization should not be used in the header).
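
The requirements above can be checked locally before uploading. The following is a minimal sketch in Python using only the standard library. It is not part of the Monolith platform: the function name `check_table` and the sample data are hypothetical, and it assumes the file is a CSV with a single header row. It flags missing values, columns that mix numeric and text values, and category labels that differ only by case or spacing.

```python
import csv
import io
from collections import defaultdict


def is_number(text):
    """Return True if the cell text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def check_table(csv_text):
    """Return a list of warnings for a CSV that has one header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    warnings = []
    for col, name in enumerate(header):
        cells = [row[col].strip() for row in data if col < len(row)]
        filled = [c for c in cells if c != ""]
        if len(filled) < len(data):
            warnings.append(f"{name}: {len(data) - len(filled)} missing value(s)")
        numeric = [is_number(c) for c in filled]
        if any(numeric) and not all(numeric):
            warnings.append(f"{name}: mixes numeric and text values")
        if filled and not any(numeric):
            # Group labels that are identical once case and spaces are ignored.
            variants = defaultdict(set)
            for c in filled:
                variants[c.lower().replace(" ", "")].add(c)
            for labels in variants.values():
                if len(labels) > 1:
                    warnings.append(f"{name}: inconsistent labels {sorted(labels)}")
    return warnings


# Hypothetical sample: one inconsistent label and one text value in a numeric column.
sample = """Failure,Height,Force
Compression,138.57,1362
compression,117.12,n/a
Tension,108.29,1475
"""

for warning in check_table(sample):
    print(warning)
```

For the sample, the sketch reports the two spellings of the compression label in the Failure column and the text value in the Force column. How the platform itself classifies such columns may differ, so treat the warnings as a pre-upload aid only.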

Examples

The examples below show data that does not satisfy the requirements, followed by one example that does:

  • Nested categorization of columns:

                 Front side                       Rear side
    Time         Height       Force  Temp         Height       Force  Temp
    0.898260434  149.1872944  1610   62.50950145  102.9695565  1435   69.99225872
    0.06824658   126.9437422  1743   69.81139095  100.1677129  1719   72.39496438
    0.223245541  131.2395906  1704   70.92918715  104.4267423  1355   79.02104831
    0.198662073  138.8477895  1347   65.37057207  146.4874195  1380   62.5873467
    0.584823038  132.2786137  1406   70.06819137  131.7884798  1765   61.55247709
  • Cells merged (and missing data):
    Time         F_Height     F_Force  F_Temp       R_Height     R_Force  R_Temp
    0.383839842  121.1055164  1688     63.7233866   101.4612084  1496     (empty)
    0.484601597  125.1104664  1638     (empty)      141.8651638  1495     (empty)
    0.830386192  103.6675687  1773     (empty)      136.6600568  1798     60.14558905
    0.93157364   122.4155221  1791     (empty)      137.486305   1338     78.97866845
    0.067163228  105.3975624  1392     72.02537429  139.4831724  1501     (empty)
  • Inconsistent strings in a categorical column:
    Failure         Height       Force  Temp
    Compression     138.5704731  1362   76.61910893
    compression     117.1162992  1422   62.95551895
    Tension         108.2897529  1475   74.16961723
    in compression  104.5295069  1677   61.75505769
    tension         123.3339419  1770   78.61567623
  • The example below shows a typical format of software outputs. For the platform, however, each column must be a variable and each row must be a sequential stamp (e.g. time) or a different experiment.
    Step 1
    Time    0.72615801
    Height  104.614631   Force:  1756  Temp:  72.38695876
    Step 2
    Time    0.452290895
    Height  139.1372833  Force:  1442  Temp:  74.92448148
    Step 3
    Time    0.624463615
    Height  147.5907217  Force:  1330  Temp:  64.89705344
    Step 4
    Time    0.323668304
  • The example below follows the correct structure for data uploaded to the platform:
    Time         F_Height     F_Force  F_Temp       R_Height     R_Force  R_Temp
    0.519042911  123.9004088  1567     71.39432069  120.7590165  1679     70.3345152
    0.902826988  124.4544405  1382     61.637524    121.0811838  1470     64.33987705
    0.869587281  109.1762983  1342     74.69825678  119.9179117  1707     62.49954938
    0.083198722  145.9851669  1407     60.32229119  144.0284513  1367     72.07843594
    0.18762859   122.550792   1584     65.540395    110.247237   1778     67.59115131
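
Step-by-step software output like the example earlier in this list can usually be reshaped into this structure with a short script. The following is a minimal sketch in Python using only the standard library. The input layout is an assumption (labels and values separated by spaces, with an optional colon after the label), and the function names `blocks_to_rows` and `rows_to_csv` are hypothetical, so adapt the parsing to the actual export format of your software.

```python
import csv
import io
import re

# Hypothetical export text: the separators (spaces, optional colons) are assumed.
raw = """Step 1
Time 0.72615801
Height 104.614631 Force: 1756 Temp: 72.38695876
Step 2
Time 0.452290895
Height 139.1372833 Force: 1442 Temp: 74.92448148
"""

# A label made of letters/underscores, an optional colon, then the value.
PAIR = re.compile(r"([A-Za-z_]+):?\s+(\S+)")


def blocks_to_rows(text):
    """Turn 'Step N' blocks of label/value pairs into one dict per step."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("Step"):
            rows.append({})  # each step becomes a new row
            continue
        if not rows:
            continue  # ignore anything before the first step
        for label, value in PAIR.findall(line):
            rows[-1][label] = value
    return rows


def rows_to_csv(rows, columns):
    """Write the rows as CSV text with a single header row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


rows = blocks_to_rows(raw)
table = rows_to_csv(rows, ["Time", "Height", "Force", "Temp"])
print(table)
```

Each step becomes one row and each label becomes one column, which matches the requirements above. A step that lacks a value (for example, one that only reports Time) produces an empty cell, so it is worth checking the result for missing data before uploading.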
