-
Notifications
You must be signed in to change notification settings - Fork 7
Expand file tree
/
Copy pathQueries_to_support
More file actions
81 lines (71 loc) · 2.15 KB
/
Queries_to_support
File metadata and controls
81 lines (71 loc) · 2.15 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
The following are a list of queries that we regularly encounter. Based on this information we should validate the data model we are using.
# Scheme
## Machine Learning
We cannot assume we will capture all ML queries. However, we should provide a generic scheme that gets the bulk of what we currently need.
This will also provide a demonstration for people that wish to extend storage to support their own efforts.
### Result
I want all peinfo results. Maybe using only version 2, maybe only tagged with special prototype, and maybe withing the last 2 months.
Columns we care about:
- service_name
- service_version
- tags
- time range
### Object
Columns we care about:
- pretty much the same as web interface
### Submission
Columns we care about:
- pretty much the same as web interface
## Web Interface
This should be typical queries via a web interface. A standard API would also apply. Collecting large sets of data should be left to spark
### Result
Give me all the results for sha256. Maybe only with service_name, maybe with a specific version, and maybe only within a time range.
Columns we care about:
- sha256
- service_name
- service_version
- time range
### Object
Give me the object that matches a sha256
Give me all the objects that are of Type. Maybe only by mime, and maybe within a time range.
Columns we care about:
- sha256
- type
- source
- time range
- tags
### Submission
Give me the submissions that matches a sha256
Give me all the submissions that are under a source, maybe tagged with mediyes, maybe within a time range
Give me all submissions under a user_id
Columns we care about:
- sha256
- source
- time range
- tags
- user_id
## Config
# Proposed API scheme
### GET
Get one item back from the server
* v2/object/sha256
* v2/raw_data/sha256
* v2/result
* v2/submission
* v2/config
* v2/objects?{uuid, [identifier], mime}
* v2/results?{uuid, sha256, service_name}
* v2/submissions?{source, date}
### PATCH
* v2/object/sha256
### POST
* v2/object
* v2/raw_data
* v2/result
* v2/submissions
* v2/config
### DELETE
* v2/object:sha256
* v2/raw_data
* v2/result
* v2/submission