-
Notifications
You must be signed in to change notification settings - Fork 261
Refactor sysfs to improve sanitization #2661
Refactor sysfs to improve sanitization #2661
Conversation
cf889cb
to
8a87585
Compare
Rebased to latest master with few updates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 10 of 19 files at r1, 10 of 10 files at r2, all commit messages.
Reviewable status: all files reviewed, 38 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @vijaydhanraj)
-- commits, line 7 at r2:
We don't use Pal-SGX
but instead Pal/Linux-SGX
. This will make sense once we have e.g. Pal/Windows-SGX
(I know, I know, never gonna happen :)).
Please fix here and in all other commit messages (you can rebase).
-- commits, line 11 at r2:
[Pal, LibOS]
-- commits, line 13 at r2:
Just remove this sentence. By the way, we never say "patch", typically we say "commit" or "change".
-- commits, line 16 at r2:
...on untrusted integers
(If I understood correctly)
-- commits, line 18 at r2:
Sounds weird. I suggest 3. Convert the integers to Linux-formatted strings.
No need to say "before the app actually starts", it is obvious.
-- commits, line 20 at r2:
Sometimes you have s
on the end of verbs, sometimes without it (e.g., refactors
vs reconvert
). Please make uniform.
-- commits, line 20 at r2:
to
-> into
-- commits, line 24 at r2:
How can this be a separate commit? Doesn't it mean that the previous commit is unusable by itself? (If you git checkout
that commit, Graphene won't even compile, right?)
-- commits, line 28 at r2:
ditto, how can this be a separate commit?
-- commits, line 36 at r2:
Simply [LibOS] Add...
-- commits, line 39 at r2:
Please remove #2453
(we never put PR numbers, especially given that we're moving to a new org). Also remove Just adding it back
-- it is obvious.
common/include/api.h, line 84 at r2 (raw file):
#define BITS_IN_BYTE 8 #define BITS_IN_TYPE(type) (sizeof(type) * BITS_IN_BYTE) #define BITS_TO_INT(nr) DIV_ROUND_UP(nr, BITS_IN_TYPE(int))
For uniformity, it should be BITS_TO_INTS
LibOS/shim/include/shim_fs_pseudo.h, line 211 at r2 (raw file):
bool sys_cpu_online_name_exists(struct shim_dentry* parent, const char* name); int sys_cpu_online_list_names(struct shim_dentry* parent, readdir_callback_t callback, void* arg); int sys_convert_int_to_str(PAL_NUM val, SIZE_QUALIFIER_T size_qual, char* str, int max_len);
We don't use types for enums. We simply use int
. So SIZE_QUALIFIER_T size_qual
-> int size_qual
.
LibOS/shim/include/shim_fs_pseudo.h, line 211 at r2 (raw file):
bool sys_cpu_online_name_exists(struct shim_dentry* parent, const char* name); int sys_cpu_online_list_names(struct shim_dentry* parent, readdir_callback_t callback, void* arg); int sys_convert_int_to_str(PAL_NUM val, SIZE_QUALIFIER_T size_qual, char* str, int max_len);
I dislike PAL_NUM
type. Let's just replace with uint64_t
.
LibOS/shim/include/shim_fs_pseudo.h, line 211 at r2 (raw file):
bool sys_cpu_online_name_exists(struct shim_dentry* parent, const char* name); int sys_cpu_online_list_names(struct shim_dentry* parent, readdir_callback_t callback, void* arg); int sys_convert_int_to_str(PAL_NUM val, SIZE_QUALIFIER_T size_qual, char* str, int max_len);
Please replace int max_len
with size_t size
. This is more common in the C world (see e.g. man snprintf
)
LibOS/shim/include/shim_fs_pseudo.h, line 212 at r2 (raw file):
int sys_cpu_online_list_names(struct shim_dentry* parent, readdir_callback_t callback, void* arg); int sys_convert_int_to_str(PAL_NUM val, SIZE_QUALIFIER_T size_qual, char* str, int max_len); int sys_convert_range_info_str(PAL_RES_RANGE_INFO res_range_info, char* str, int max_len,
PAL_RES_RANGE_INFO
is a struct. We never pass structs by value, only by reference. So:
PAL_RES_RANGE_INFO res_range_info
-> const PAL_RES_RANGE_INFO* res_range_info
LibOS/shim/include/shim_fs_pseudo.h, line 212 at r2 (raw file):
int sys_cpu_online_list_names(struct shim_dentry* parent, readdir_callback_t callback, void* arg); int sys_convert_int_to_str(PAL_NUM val, SIZE_QUALIFIER_T size_qual, char* str, int max_len); int sys_convert_range_info_str(PAL_RES_RANGE_INFO res_range_info, char* str, int max_len,
..._str
-> ..._to_str
LibOS/shim/include/shim_fs_pseudo.h, line 214 at r2 (raw file):
int sys_convert_range_info_str(PAL_RES_RANGE_INFO res_range_info, char* str, int max_len, const char* sep); int sys_convert_range_info_bitmap_str(PAL_RES_RANGE_INFO res_range_info, char* str, int max_len);
ditto (pass by reference)
LibOS/shim/include/shim_fs_pseudo.h, line 214 at r2 (raw file):
int sys_convert_range_info_str(PAL_RES_RANGE_INFO res_range_info, char* str, int max_len, const char* sep); int sys_convert_range_info_bitmap_str(PAL_RES_RANGE_INFO res_range_info, char* str, int max_len);
..._str
-> ..._to_str
LibOS/shim/include/shim_fs_pseudo.h, line 214 at r2 (raw file):
int sys_convert_range_info_str(PAL_RES_RANGE_INFO res_range_info, char* str, int max_len, const char* sep); int sys_convert_range_info_bitmap_str(PAL_RES_RANGE_INFO res_range_info, char* str, int max_len);
What is the difference between the two last functions? They have almost the same arguments. Please add a comment.
LibOS/shim/src/fs/proc/info.c, line 116 at r2 (raw file):
} while (0) uint64_t online_logical_cores = g_pal_control->topo_info.online_logical_cores.resource_count;
uint64_t
-> size_t
(they are equivalent, but you want your code to be uniform -- currently you have one var as uint64_t
and the other as size_t
)
LibOS/shim/src/fs/sys/cache_info.c, line 30 at r2 (raw file):
const char* name = dent->name; PAL_CORE_CACHE_INFO* cache = &g_pal_control->topo_info.core_topology[cpu_num].cache[cache_num]; char str[PAL_SYSFS_MAP_FILESZ] = {'\0'};
PAL_SYSFS_MAP_FILESZ
is 256B in size. This is still a small size to allocate on the stack (as done here), but we should be very careful with allocating strings on the stack.
Please add a comment where you define PAL_SYSFS_MAP_FILESZ
that it is typically allocated on the stack and thus should be reasonably small-sized.
LibOS/shim/src/fs/sys/cache_info.c, line 32 at r2 (raw file):
char str[PAL_SYSFS_MAP_FILESZ] = {'\0'}; if (strcmp(name, "shared_cpu_map") == 0) { ret = sys_convert_range_info_bitmap_str(cache->shared_cpu_map, str, PAL_SYSFS_MAP_FILESZ);
Here and everywhere, replace PAL_SYSFS_MAP_FILESZ
with sizeof(str)
LibOS/shim/src/fs/sys/cpu_info.c, line 21 at r2 (raw file):
if (strcmp(name, "online") == 0) { ret = sys_convert_range_info_str(g_pal_control->topo_info.online_logical_cores, str,
All the same comments apply to this file.
LibOS/shim/src/fs/sys/fs.c, line 39 at r2 (raw file):
const char* sep) { if (res_range_info.range_count > INT64_MAX) return -EINVAL;
Please add a newline after this
LibOS/shim/src/fs/sys/fs.c, line 47 at r2 (raw file):
int ret; char end_str[PAL_SYSFS_BUF_FILESZ] = {'\0'};
Why do you need end_str
? You can just perform several snprintf(str + offset, ...)
in this for-loop body. This will be easier to read. Also please add brief comments. E.g., UINT64_MAX
looks like a magical number you use (for whatever reason).
LibOS/shim/src/fs/sys/fs.c, line 75 at r2 (raw file):
(flag & 0xf00) >> 8, \ (flag & 0xf0) >> 4, \ (flag & 0xf)
What's the point in these macros if you only use them once? Just embed in C code.
LibOS/shim/src/fs/sys/fs.c, line 90 at r2 (raw file):
if (res_range_info.range_count > INT64_MAX) return -EINVAL;
Add newline
LibOS/shim/src/fs/sys/fs.c, line 123 at r2 (raw file):
ret = 0; out_bitmap:
We simply call labels like this out
, no point in fancy names.
LibOS/shim/src/fs/sys/fs.c, line 136 at r2 (raw file):
if (strcmp(parent_name, "node") == 0) { pal_total = g_pal_control->topo_info.nodes.resource_count;
Why this field is called resource_count
, why not simply count
? It belongs to nodes
, so why the additional word?
Pal/include/pal_internal.h, line 261 at r2 (raw file):
void* calloc(size_t nmem, size_t size); void free(void* mem); void* realloc_size(void* ptr, size_t old_size, size_t new_size);
I'm not exactly happy with this function... Can we refactor the code that uses it so that we don't need this function?
Pal/include/arch/x86_64/pal-arch.h, line 281 at r2 (raw file):
#define KILO_BYTE 1024 #define MEGA_BYTE (KILO_BYTE * 1024) #define GIGA_BYTE (MEGA_BYTE * 1024)
First, just use these numbers directly (everybody knows what 1024
is, and it never changes).
Second, are you sure you need to multiply all your sysfs resources by 1024
and not by 1000
? There is a difference between kibi (1024
) and kilo (1000
).
Pal/include/arch/x86_64/pal-arch.h, line 288 at r2 (raw file):
MEGA, GIGA, } SIZE_QUALIFIER_T;
Please remove the enum type name -- we never use it for enums. Also, please remember that enum items are global in the whole codebase, so you should give them meaningful prefixes. Finally, no need to write = 0
, this is already guaranteed by the C standard. So I suggest this:
enum {
SIZE_QUALIFIER_NONE,
SIZE_QUALIFIER_KILO,
SIZE_QUALIFIER_MEGA,
SIZE_QUALIFIER_GIGA,
};
Pal/include/arch/x86_64/pal-arch.h, line 294 at r2 (raw file):
INSTRUCTION, UNIFIED, } CACHE_TYPE_T;
enum {
CACHE_TYPE_DATA,
CACHE_TYPE_INSTRUCTION,
CACHE_TYPE_UNIFIED,
};
Pal/include/arch/x86_64/pal-arch.h, line 314 at r2 (raw file):
typedef struct PAL_RES_RANGE_INFO_ { PAL_NUM resource_count; PAL_NUM range_count;
What is the difference between these two counts? I guess one contains "total number of resources" and the other one contains "number of ranges"? Please add a short comment.
Pal/include/arch/x86_64/pal-arch.h, line 315 at r2 (raw file):
PAL_NUM resource_count; PAL_NUM range_count; PAL_RANGE_INFO* ranges;
One more pointer... I guess this leads to very cumbersome malloc/realloc/free
patterns in the code. Can we just hard-code some reasonable number, like PAL_RANGE_INFO ranges[8]
? This should simplify the code significantly.
Pal/src/host/Linux-common/topo_info.c, line 31 at r2 (raw file):
} return tmp; }
Why is this function here? We already have realloc_size
.
Pal/src/host/Linux-SGX/db_main.c, line 339 at r2 (raw file):
PAL_NUM range_count; PAL_RANGE_INFO* ranges; } PAL_SOCKET_INFO;
Why is it declared here? If it is truly needed, can we hard-code the size of the array like PAL_RANGE_INFO ranges[8]
?
8a87585
to
0dc6f10
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 9 of 20 files reviewed, 38 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv)
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
We don't use
Pal-SGX
but insteadPal/Linux-SGX
. This will make sense once we have e.g.Pal/Windows-SGX
(I know, I know, never gonna happen :)).Please fix here and in all other commit messages (you can rebase).
:) yes done.
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
[Pal, LibOS]
Done.
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Just remove this sentence. By the way, we never say "patch", typically we say "commit" or "change".
Done.
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
...on untrusted integers
(If I understood correctly)
yes, done.
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Sounds weird. I suggest
3. Convert the integers to Linux-formatted strings.
No need to say "before the app actually starts", it is obvious.
how about Convert integers back to Linux-formatted strings in LibOS.
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Sometimes you have
s
on the end of verbs, sometimes without it (e.g.,refactors
vsreconvert
). Please make uniform.
Done.
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
to
->into
Done.
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
How can this be a separate commit? Doesn't it mean that the previous commit is unusable by itself? (If you
git checkout
that commit, Graphene won't even compile, right?)
yes, you are right. I have merged /proc/cpuinfo to use topo info from PAL_TOPO_INFO
commit and Update set/get affinity syscall
commit into Refactor sysfs to improve sanitization
.
I think I can pull out 4th item (consolidate all topology related ...
) of Refactor sysfs to improve sanitization
into a separate commit if that helps.
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
ditto, how can this be a separate commit?
Done.
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Simply
[LibOS] Add...
Done.
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please remove
#2453
(we never put PR numbers, especially given that we're moving to a new org). Also removeJust adding it back
-- it is obvious.
Done.
common/include/api.h, line 84 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
For uniformity, it should be
BITS_TO_INTS
Done.
LibOS/shim/include/shim_fs_pseudo.h, line 211 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
We don't use types for enums. We simply use
int
. SoSIZE_QUALIFIER_T size_qual
->int size_qual
.
Done, instead of int I changed it to uint64_t
LibOS/shim/include/shim_fs_pseudo.h, line 211 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I dislike
PAL_NUM
type. Let's just replace withuint64_t
.
Done.
LibOS/shim/include/shim_fs_pseudo.h, line 211 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please replace
int max_len
withsize_t size
. This is more common in the C world (see e.g.man snprintf
)
yes, you are right but I put this for uniformity. We have 3 string manipulation functions, 1) sys_convert_int_to_str
, 2) sys_convert_range_info_str
and 3) sys_convert_range_info_bitmap_str
. All these functions use snprintf but for functions 2 and 3 check for negative offsets. Please see https://github.com/gramineproject/graphene/pull/2661/files#diff-9e25c4e2e0ea819880c207dd80fa7acbeec1f5d1d822502427abd1c2b70a71e4R43 and https://github.com/gramineproject/graphene/pull/2661/files#diff-9e25c4e2e0ea819880c207dd80fa7acbeec1f5d1d822502427abd1c2b70a71e4R111. Having size_t
type might cause the compiler to convert even offset
to unsigned int
and this check will never be less than 0. Given this, I kept as int max_len
even for sys_convert_int_to_str
function.
How about adding an assert
or condition to check that max_len
is not negative here? Please note this is an internal LibOS function and we always pass a #define
positive value here.
LibOS/shim/include/shim_fs_pseudo.h, line 212 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
PAL_RES_RANGE_INFO
is a struct. We never pass structs by value, only by reference. So:
PAL_RES_RANGE_INFO res_range_info
->const PAL_RES_RANGE_INFO* res_range_info
Done. Curious, are you suggesting passing by reference to save stack space?
LibOS/shim/include/shim_fs_pseudo.h, line 212 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
..._str
->..._to_str
Done, also renamed the function from sys_convert_range_info_to_str
to sys_convert_range_to_str
.
LibOS/shim/include/shim_fs_pseudo.h, line 214 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
ditto (pass by reference)
Done
LibOS/shim/include/shim_fs_pseudo.h, line 214 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
..._str
->..._to_str
Done, also renamed the function from sys_convert_range_info_bitmap_str
to sys_convert_range_bitmap_to_str
LibOS/shim/include/shim_fs_pseudo.h, line 214 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
What is the difference between the two last functions? They have almost the same arguments. Please add a comment.
There are two types of ranges we encounter, 1) 0-63
to represent a numerical range like the number of cpus online 2) 00000000, ffffffff
to represent bitmap like core_siblings. To convert ranges of 1st type, I have sys_convert_range_info_to_str
and to convert ranges of 2nd type I have sys_convert_range_to_cpu_bitmap_str
.
I have added the below comments to the function declaration as well.
/* Converts integers to string. If size qualifier like 'K', 'M' or 'G' is passed, it is appended
* to the string. */
int sys_convert_int_to_str(uint64_t val, int size_qual, char* str, int max_len);
/* Converts integer ranges to string. For example if ranges.start and ranges.end were 0 and 63
* respectively, then `0-63` string is generated. */
int sys_convert_range_to_str(const PAL_RES_RANGE_INFO* res_range_info, char* str, int max_len,
const char* sep);
/* Converts integer ranges to cpu bitmap strings. For example, if ranges.start and ranges.end were
* 32 and 63, then `ffffffff,00000000` string is generated */
int sys_convert_range_to_cpu_bitmap_str(const PAL_RES_RANGE_INFO* res_range_info, char* str,
int max_len);
LibOS/shim/src/fs/proc/info.c, line 116 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
uint64_t
->size_t
(they are equivalent, but you want your code to be uniform -- currently you have one var asuint64_t
and the other assize_t
)
I have changed type of n
variable fromsize_t
->unit64_t
LibOS/shim/src/fs/sys/cache_info.c, line 30 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
PAL_SYSFS_MAP_FILESZ
is 256B in size. This is still a small size to allocate on the stack (as done here), but we should be very careful with allocating strings on the stack.Please add a comment where you define
PAL_SYSFS_MAP_FILESZ
that it is typically allocated on the stack and thus should be reasonably small-sized.
Added this note, This MACRO is used to allocate on the stack, so increase the size if needed with caution or use malloc.
LibOS/shim/src/fs/sys/cache_info.c, line 32 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Here and everywhere, replace
PAL_SYSFS_MAP_FILESZ
withsizeof(str)
Done.
LibOS/shim/src/fs/sys/cpu_info.c, line 21 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
All the same comments apply to this file.
Done.
LibOS/shim/src/fs/sys/fs.c, line 39 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please add a newline after this
Done.
LibOS/shim/src/fs/sys/fs.c, line 47 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why do you need
end_str
? You can just perform severalsnprintf(str + offset, ...)
in this for-loop body. This will be easier to read. Also please add brief comments. E.g.,UINT64_MAX
looks like a magical number you use (for whatever reason).
end_str
is a helper variable. Usually, a range is something like 0-63
and we have ranges[0].start = 0
and ranges[0].end = 63
but sometimes we non-hyphenated values, like numa distance which looks like 20 10
. Here there will be 2 ranges, ranges[0].start = 20
and ranges[0].end = UINT64_MAX
and ranges[0].start = 10
and ranges[0].end = UINT64_MAX
. Since we don't know if the range is hyphenated or not, I use this helper variable to construct the final string.
Regarding UINT64_MAX
, I do have a brief comment here, https://github.com/gramineproject/graphene/pull/2661/files#diff-dc2a8b7f2d092ccddbc75cab14472b1af87c24992bc2a6aefc742e15bb965cd9R309. I didn't want to repeat this in multiple places.
LibOS/shim/src/fs/sys/fs.c, line 75 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
What's the point in these macros if you only use them once? Just embed in C code.
Felt the MACRO will make the code look cleaner. But have removed it.
LibOS/shim/src/fs/sys/fs.c, line 90 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Add newline
Done.
LibOS/shim/src/fs/sys/fs.c, line 123 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
We simply call labels like this
out
, no point in fancy names.
Done.
LibOS/shim/src/fs/sys/fs.c, line 136 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why this field is called
resource_count
, why not simplycount
? It belongs tonodes
, so why the additional word?
typedef struct PAL_RES_RANGE_INFO_ {
PAL_NUM resource_count;
PAL_NUM range_count;
PAL_RANGE_INFO* ranges;
} PAL_RES_RANGE_INFO;
As you can see, there are 2 count variables as part of this struct which is used to represent a node
or online_logical_core
resource. To differentiate between the two, I have named it this way.
Say for example, this path /sys/devices/system/cpu/online
gives a string like 0-31, 45-63
. We store the total online logical cores count in resource_count
which will be 51
and range_count
will be 2
(0-31
, 45-63
)
Pal/include/pal_internal.h, line 261 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I'm not exactly happy with this function... Can we refactor the code that uses it so that we don't need this function?
We can save few lines in ../host/Linux-SGX/db_main.c
by reusing this function from slab.c
Pal/include/arch/x86_64/pal-arch.h, line 281 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
First, just use these numbers directly (everybody knows what
1024
is, and it never changes).Second, are you sure you need to multiply all your sysfs resources by
1024
and not by1000
? There is a difference between kibi (1024
) and kilo (1000
).
OK, removed the MACROs
Regarding multiplier, it is used to represent cache size which is in bytes.
Pal/include/arch/x86_64/pal-arch.h, line 288 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please remove the enum type name -- we never use it for enums. Also, please remember that enum items are global in the whole codebase, so you should give them meaningful prefixes. Finally, no need to write
= 0
, this is already guaranteed by the C standard. So I suggest this:enum { SIZE_QUALIFIER_NONE, SIZE_QUALIFIER_KILO, SIZE_QUALIFIER_MEGA, SIZE_QUALIFIER_GIGA, };
OK, but renamed as below to avoid ambiguity between kilo
and kilo byte
and to reduce the length a bit.
enum {
MULTIPLIER_NONE,
MULTIPLIER_KB,
MULTIPLIER_MB,
MULTIPLIER_GB,
};
Pal/include/arch/x86_64/pal-arch.h, line 294 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
enum { CACHE_TYPE_DATA, CACHE_TYPE_INSTRUCTION, CACHE_TYPE_UNIFIED, };
I don't think we need explicitly call out CACHE_TYPE
based on code and the name itself is self-explanatory. So just removing the enum type name.
Pal/include/arch/x86_64/pal-arch.h, line 314 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
What is the difference between these two counts? I guess one contains "total number of resources" and the other one contains "number of ranges"? Please add a short comment.
Please refer to the earlier comment on this.
Added a short comment as well.
Pal/include/arch/x86_64/pal-arch.h, line 315 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
One more pointer... I guess this leads to very cumbersome
malloc/realloc/free
patterns in the code. Can we just hard-code some reasonable number, likePAL_RANGE_INFO ranges[8]
? This should simplify the code significantly.
yes, agree it will be much simpler, but this will be an issue when we introduce core-offlining. For example, we can have a configuration like, 1-2
4-5
, 7-8
10-11 ...
and so on until we hit 127th core. This will easily exceed reasonable limits like 8
. One good thing is we can avoid complications with freeing
of these resources as if any condition fails then we terminate right away. If not, we need these variables until the user quits.
Pal/src/host/Linux-common/topo_info.c, line 31 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why is this function here? We already have
realloc_size
.
Looks like we might need to modify sources to use realloc_size
here. So, I have refactored the code and removed this function.
Pal/src/host/Linux-SGX/db_main.c, line 339 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why is it declared here? If it is truly needed, can we hard-code the size of the array like
PAL_RANGE_INFO ranges[8]
?
Currently for socket info, sysfs just tells us which core is part of which socket. We get this from, /sys/devices/system/cpu/cpuX/topology/physical_package_id
. Something like CPU0->socket 1.... CPU32->socket 2
. But I am converting this info to a bitmap, something like socket[0] -> 0-1, 3-5, 5-6, 7-8....31
and socket[1]->33-34, 36-37, 49-40....63
so that I can compare this against the core_sibling
which basically tells all the cores within a socket.
So, I introduced this PAL_SOCKET_INFO
struct which is a little different than PAL_RANGE_INFO
which only has start
and end
of range. Here we also need to keep track of the range_count
as it is unique to each socket. We could have used PAL_RANGE_INFO
struct but this also has resource count which we don't need here.
This struct is only used for sanitizing the sockets. How about I declare this struct inside the function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 11 of 11 files at r3, all commit messages.
Reviewable status: all files reviewed, 72 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @vijaydhanraj)
a discussion (no related file):
Given your other PR on getcpu()
support, what should be the order of merging this PRs?
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
yes, done.
Actually, I still don't understand step 2. Aren't you supposed to 2. Sanitize CPU/NUMA information based on untrusted integers
? I mean, it's not about sysfs paths
because at this point you don't have any paths -- you just have an internal representation of the CPU/NUMA topology which is represented as a bunch of integers.
Please reword.
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
how about
Convert integers back to Linux-formatted strings in LibOS.
Sounds good.
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
yes, you are right. I have merged
/proc/cpuinfo to use topo info from PAL_TOPO_INFO
commit andUpdate set/get affinity syscall
commit intoRefactor sysfs to improve sanitization
.I think I can pull out 4th item (
consolidate all topology related ...
) ofRefactor sysfs to improve sanitization
into a separate commit if that helps.
No, no need to create a separate commit. This looks good.
-- commits, line 7 at r3:
This is a strange message. What you do here is add a new realloc_size()
function. I don't see how adding something new is "exposing internal implementation". Am I missing something? Do you have two realloc_size()
functions now? If yes, then why?
LibOS/shim/include/shim_fs_pseudo.h, line 211 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
yes, you are right but I put this for uniformity. We have 3 string manipulation functions,
1) sys_convert_int_to_str
,2) sys_convert_range_info_str
and3) sys_convert_range_info_bitmap_str
. All these functions use snprintf but for functions 2 and 3 check for negative offsets. Please see https://github.com/gramineproject/graphene/pull/2661/files#diff-9e25c4e2e0ea819880c207dd80fa7acbeec1f5d1d822502427abd1c2b70a71e4R43 and https://github.com/gramineproject/graphene/pull/2661/files#diff-9e25c4e2e0ea819880c207dd80fa7acbeec1f5d1d822502427abd1c2b70a71e4R111. Havingsize_t
type might cause the compiler to convert evenoffset
tounsigned int
and this check will never be less than 0. Given this, I kept asint max_len
even forsys_convert_int_to_str
function.How about adding an
assert
or condition to check thatmax_len
is not negative here? Please note this is an internal LibOS function and we always pass a#define
positive value here.
I see. Yes, please add an assert then.
LibOS/shim/include/shim_fs_pseudo.h, line 212 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Done. Curious, are you suggesting passing by reference to save stack space?
Thanks. No, it's not about saving stack space (which of course is also true, but not the main point). We do this because that's the norm in C world, so the C developer who will read PAL_RES_RANGE_INFO res_range_info
in 5 years from now will immediatelly assume that the type PAL_RES_RANGE_INFO
is a typedef for some "pointer to some struct", and that res_range_info
was passed by reference. Of course, quite quickly the developer will figure out this is not the case and will write us some hate mails :)
Basically, this change is to stick to the principle of least surprise. 99.9% of the C code is written this way, and there is no reason in this case to be in that 0.01%.
LibOS/shim/include/shim_fs_pseudo.h, line 214 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
There are two types of ranges we encounter, 1)
0-63
to represent a numerical range like the number of cpus online 2)00000000, ffffffff
to represent bitmap like core_siblings. To convert ranges of 1st type, I havesys_convert_range_info_to_str
and to convert ranges of 2nd type I havesys_convert_range_to_cpu_bitmap_str
.I have added the below comments to the function declaration as well.
/* Converts integers to string. If size qualifier like 'K', 'M' or 'G' is passed, it is appended * to the string. */ int sys_convert_int_to_str(uint64_t val, int size_qual, char* str, int max_len); /* Converts integer ranges to string. For example if ranges.start and ranges.end were 0 and 63 * respectively, then `0-63` string is generated. */ int sys_convert_range_to_str(const PAL_RES_RANGE_INFO* res_range_info, char* str, int max_len, const char* sep); /* Converts integer ranges to cpu bitmap strings. For example, if ranges.start and ranges.end were * 32 and 63, then `ffffffff,00000000` string is generated */ int sys_convert_range_to_cpu_bitmap_str(const PAL_RES_RANGE_INFO* res_range_info, char* str, int max_len);
OK. Thanks, the comments help.
LibOS/shim/include/shim_fs_pseudo.h, line 216 at r3 (raw file):
int sys_convert_int_to_str(uint64_t val, uint64_t size_mult, char* str, int max_len); /* Converts integer ranges to a string. For example if ranges.start and ranges.end were 0 and 63
Why do you say ranges
? Isn't it just one range that you convert? If it's just one range, why do you say ranges.start
/ranges.end
(and not e.g. range.start
)?
If it is not one range but an array of ranges, please specify it in the comment (and also give a better example, like 0-63,64-127
string is generated).
UPDATE: I looked at the code. So this is an array of ranges. Please update your comment to make it more explicit that this is an array, e.g., describe an example with two ranges. Also, the function name is misleading, right? It should be sys_convert_ranges_to_str
(notice the plural).
LibOS/shim/include/shim_fs_pseudo.h, line 218 at r3 (raw file):
/* Converts integer ranges to a string. For example if ranges.start and ranges.end were 0 and 63 * respectively, then `0-63` string is generated. */ int sys_convert_range_to_str(const PAL_RES_RANGE_INFO* res_range_info, char* str, int max_len,
Use the plural in function name (ranges
)
LibOS/shim/include/shim_fs_pseudo.h, line 219 at r3 (raw file):
* respectively, then `0-63` string is generated. */ int sys_convert_range_to_str(const PAL_RES_RANGE_INFO* res_range_info, char* str, int max_len, const char* sep);
By the way, what is the need for sep
? Isn't it always ,
(comma)?
LibOS/shim/include/shim_fs_pseudo.h, line 221 at r3 (raw file):
const char* sep); /* Converts integer ranges to a cpu bitmap string. For example, if ranges.start and ranges.end were
Why do you say ranges
? Isn't it just one range that you convert? If it's just one range, why do you say ranges.start
/ranges.end
?
LibOS/shim/src/fs/sys/cache_info.c, line 57 at r3 (raw file):
} else if (strcmp(name, "physical_line_partition") == 0) { ret = sys_convert_int_to_str(cache->physical_line_partition, MULTIPLIER_NONE, str, sizeof(str));
I quite like this new code! Very readable.
LibOS/shim/src/fs/sys/cpu_info.c, line 16 at r3 (raw file):
int sys_cpu_general_load(struct shim_dentry* dent, char** out_data, size_t* out_size) { int ret = 0;
No reason to = 0
. But not blocking.
LibOS/shim/src/fs/sys/fs.c, line 47 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
end_str
is a helper variable. Usually, a range is something like0-63
and we haveranges[0].start = 0
andranges[0].end = 63
but sometimes we non-hyphenated values, like numa distance which looks like20 10
. Here there will be 2 ranges,ranges[0].start = 20
andranges[0].end = UINT64_MAX
andranges[0].start = 10
andranges[0].end = UINT64_MAX
. Since we don't know if the range is hyphenated or not, I use this helper variable to construct the final string.Regarding
UINT64_MAX
, I do have a brief comment here, https://github.com/gramineproject/graphene/pull/2661/files#diff-dc2a8b7f2d092ccddbc75cab14472b1af87c24992bc2a6aefc742e15bb965cd9R309. I didn't want to repeat this in multiple places.
Ok, two things.
-
Why magic value of
UINT64_MAX
? If you have a range consisting of only one value, then you can always specifyranges[0].start = ranges[0].end = 20
, which gives you the single value20
. I highly dislike this magic value, when there is a human-readable alternative. -
I still don't see a need in
end_str
. You could simply do:
if (res_range_info->ranges[i].start == res_range_info->ranges[i].end) {
// print a single value
snprintf(str + offset, max_len - offset, "%lu%s", res_range_info->ranges[i].start, (i + 1 == range_cnt) ? "\0" : sep);
} else {
// print a range
snprintf(str + offset, max_len - offset, "%lu-%lu%s", res_range_info->ranges[i].start, res_range_info->ranges[i].end, (i + 1 == range_cnt) ? "\0" : sep);
}
LibOS/shim/src/fs/sys/fs.c, line 44 at r3 (raw file):
int offset = 0; for (int64_t i = 0; i < range_cnt; i++) { if (max_len - offset < 0)
Why so weird? Why not if (offset > max_len)
? This is much more readable ("if current offset exceeds the maximum length").
This also allows you to change the types from int
to uint64_t
everywhere.
LibOS/shim/src/fs/sys/fs.c, line 69 at r3 (raw file):
int sys_convert_range_to_cpu_bitmap_str(const PAL_RES_RANGE_INFO* res_range_info, char* str, int max_len) { if (g_pal_control->topo_info.possible_logical_cores.resource_count > INT64_MAX)
Why do you have this check here? Shouldn't this check be in the code where you init g_pal_control->topo_info.possible_logical_cores.resource_count
?
LibOS/shim/src/fs/sys/fs.c, line 76 at r3 (raw file):
int64_t possible_cores = g_pal_control->topo_info.possible_logical_cores.resource_count; int64_t num_cpumask = BITS_TO_INTS(possible_cores); unsigned int* bitmap = (unsigned int*)calloc(num_cpumask, sizeof(unsigned int));
Please use uint32_t
instead of unsigned int
LibOS/shim/src/fs/sys/fs.c, line 91 at r3 (raw file):
return -EINVAL; for (int64_t j = (int64_t)start; j <= (int64_t)end; j++) { int64_t index = j / (sizeof(int) * BITS_IN_BYTE);
Don't you have BITS_IN_INT
to replace this (sizeof(int) * BITS_IN_BYTE)
?
LibOS/shim/src/fs/sys/fs.c, line 96 at r3 (raw file):
goto out; } bitmap[index] |= 1U << (j % (sizeof(int) * BITS_IN_BYTE));
ditto
LibOS/shim/src/fs/sys/fs.c, line 103 at r3 (raw file):
int offset = 0; for (int64_t j = num_cpumask - 1; j >= 0; j-- ) { if (max_len - offset < 0) {
ditto (use more straight-forward check)
LibOS/shim/src/fs/sys/node_info.c, line 17 at r3 (raw file):
const char* name = dent->name; char str[PAL_SYSFS_BUF_FILESZ] = {'\0'}; int ret = 0;
Please move to the very beginning of this function. Also, = 0
is not needed here.
LibOS/shim/src/fs/sys/node_info.c, line 52 at r3 (raw file):
ret = sys_convert_int_to_str(numa_topology->nr_hugepages[HUGEPAGES_1G], MULTIPLIER_NONE, str, sizeof(str)); }
You should also add else { ... failed to recognize hugepages ... }
here.
Pal/include/arch/x86_64/pal-arch.h, line 288 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
OK, but renamed as below to avoid ambiguity between
kilo
andkilo byte
and to reduce the length a bit.enum { MULTIPLIER_NONE, MULTIPLIER_KB, MULTIPLIER_MB, MULTIPLIER_GB, };
Sounds good.
Pal/include/arch/x86_64/pal-arch.h, line 294 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
I don't think we need explicitly call out
CACHE_TYPE
based on code and the name itself is self-explanatory. So just removing the enum type name.
Wait, how is this "self-explanatory"? When I look at this file pal-arch.h
, I have no idea what these constants mean. As a future reader of this code, I shouldn't grep our whole code base to realize that these constants are used to describe CPU cache types. And UNIFIED
keyword can mean anything... I still want CACHE_TYPE_
prefix here (or whatever you think better describes it).
Pal/include/arch/x86_64/pal-arch.h, line 315 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
yes, agree it will be much simpler, but this will be an issue when we introduce core-offlining. For example, we can have a configuration like,
1-2
4-5
,7-8
10-11 ...
and so on until we hit 127th core. This will easily exceed reasonable limits like8
. One good thing is we can avoid complications withfreeing
of these resources as if any condition fails then we terminate right away. If not, we need these variables until the user quits.
OK. Let's leave like this.
Pal/include/arch/x86_64/pal-arch.h, line 35 at r3 (raw file):
/* Used to represent buffers having numeric values with text. E.g "1024576K". * Note: This MACRO is used to allocate on the stack, so increase the size if needed with caution or * use malloc. */
No need to capitalize the word macro
. Also, you can shorten to just:
NOTE: Used to allocate on stack; increase with caution or use malloc instead.
Pal/include/arch/x86_64/pal-arch.h, line 39 at r3 (raw file):
/* Used to represent cpumaps like "00000000,ffffffff,00000000,ffffffff". * Note: This MACRO is used to allocate on the stack, so increase the size if needed with caution or * use malloc. */
ditto
Pal/include/arch/x86_64/pal-arch.h, line 309 at r3 (raw file):
typedef struct PAL_RANGE_INFO_ { PAL_NUM start; PAL_NUM end; /* If end is not present, set this to UINT64_MAX as end marker. */
No need for this magic value. If start == end
, then it's not a range but a single value.
Pal/include/arch/x86_64/pal-arch.h, line 315 at r3 (raw file):
/* Count of total number of resources present. Eg. 0-63 will result in this count being 64 */ PAL_NUM resource_count; /* Count of total number of ranges present. Eg. 0-31,32-63, will result in this count being 2 */
Please remove trailing ,
in the example
Pal/include/arch/x86_64/pal-arch.h, line 332 at r3 (raw file):
typedef struct PAL_CORE_TOPO_INFO_ { /* [0] element is uninitialized because core 0 is always online */
Actually, I don't understand how this comment concerns this struct. Did it get stale?
Pal/include/host/Linux-common/topo_info.h, line 14 at r3 (raw file):
* HW resources present in the file (if count == true) and stores the result in `PAL_RES_RANGE_INFO` * struct if provided or simply reads the integer stored in the file (if count == false). If * `size_multiplier` is passed, then size qualifier like "K"/"M"/"G" are stored while reading the
size_multiplier
-> size_mult
are
-> is
Also, size qualifier like "K"/"M"/"G"
sounds like you will save the string into size_mult
Pal/src/host/Linux/arch/x86_64/db_arch_misc.c, line 167 at r3 (raw file):
} } ci->cpu_socket = cpu_socket;
Why did you remove it? Shouldn't you use pal_control->topo_info
to populate these fields?
Pal/src/host/Linux-common/topo_info.c, line 27 at r3 (raw file):
} if(size_mult)
Add a space after if
Pal/src/host/Linux-common/topo_info.c, line 76 at r3 (raw file):
} /* caller wants to count the number of HW resources */
This comment is now stale. It's not just counting the number of resources, but also populating the res_info
array.
Pal/src/host/Linux-common/topo_info.c, line 128 at r3 (raw file):
res_info->ranges[res_info->range_count - 1].start = firstint; res_info->ranges[res_info->range_count - 1].end = secondint; }
The two new code snippets seem to be almost exactly the same. Cannot we move them out to the outer scope and merge into one?
Pal/src/host/Linux-common/topo_info.c, line 209 at r3 (raw file):
snprintf(filename, sizeof(filename), "/sys/devices/system/cpu/cpu%d/cache/index%d/shared_cpu_list", core_idx, lvl); ret = get_hw_resource(filename, /*count=*/true, &core_cache[lvl].shared_cpu_map, NULL);
Hm... Maybe we should rename shared_cpu_map
struct field to shared_cpu_list
? I don't have a strong opinion on this, so feel free to ignore.
Pal/src/host/Linux-common/topo_info.c, line 209 at r3 (raw file):
snprintf(filename, sizeof(filename), "/sys/devices/system/cpu/cpu%d/cache/index%d/shared_cpu_list", core_idx, lvl); ret = get_hw_resource(filename, /*count=*/true, &core_cache[lvl].shared_cpu_map, NULL);
Could you please add argument names in all these get_hw_resource()
invokations? In particular, here it will be:
get_hw_resource(filename, /*count=*/true, &core_cache[lvl].shared_cpu_map, /*size_mult=*/NULL);
...and everywhere else please.
Pal/src/host/Linux-common/topo_info.c, line 223 at r3 (raw file):
snprintf(filename, sizeof(filename), "/sys/devices/system/cpu/cpu%d/cache/index%d/type", core_idx, lvl); READ_FILE_BUFFER(filename, type, /*failure_label=*/out_cache);
It looks like this READ_FILE_BUFFER
macro is almost not used. Can we replace it with explicit code then?
Pal/src/host/Linux-common/topo_info.c, line 289 at r3 (raw file):
int possible_logical_cores = topo_info->possible_logical_cores.resource_count; /* TODO: correctly support offline cores */ if (possible_logical_cores > 0 && possible_logical_cores > online_logical_cores) {
Why this check possible_logical_cores > 0
? How can possible_logical_cores
be zero or less?
Pal/src/host/Linux-common/topo_info.c, line 290 at r3 (raw file):
/* TODO: correctly support offline cores */ if (possible_logical_cores > 0 && possible_logical_cores > online_logical_cores) { log_warning("some CPUs seem to be offline; Graphene doesn't take this into account which "
some
-> Some
Graphene
-> Gramine
Pal/src/host/Linux-common/topo_info.c, line 321 at r3 (raw file):
if (ret < 0) goto out_topology; core_topology[idx].core_id = ret;
You have double-space
Pal/src/host/Linux-common/topo_info.c, line 340 at r3 (raw file):
if (ret < 0) goto out_topology; core_topology[idx].cpu_socket = ret;
You have double-space
Pal/src/host/Linux-common/topo_info.c, line 343 at r3 (raw file):
if (ret > current_max_socket) { current_max_socket = ret; num_sockets++;
First, this is wrong: if your first CPU (cpu0
) belongs to socket 2, you will immediately have current_max_socket = 2
, and you'll never increase num_sockets
again. So you'll end up with num_sockets = 1
, which will be wrong.
Second, why do you even need num_sockets
? Isn't current_max_socket
already give you the total number of sockets available? It feels like topo_info->num_sockets = current_max_socket + 1
is sufficient.
Pal/src/host/Linux-common/topo_info.c, line 353 at r3 (raw file):
topo_info->num_sockets = num_sockets; topo_info->physical_cores_per_socket = core_topology[0].core_siblings.resource_count / core_topology[0].thread_siblings.resource_count;
Could it be that core_topology[0].thread_siblings.resource_count == 0
?
I just don't want a divison-by-zero error in Gramine. If this is possible, please add an additional check. If it is not possible, feel free to ignore my comment.
Pal/src/host/Linux-SGX/db_main.c, line 339 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Currently for socket info, sysfs just tells us which core is part of which socket. We get this from,
/sys/devices/system/cpu/cpuX/topology/physical_package_id
. Something likeCPU0->socket 1.... CPU32->socket 2
. But I am converting this info to a bitmap, something likesocket[0] -> 0-1, 3-5, 5-6, 7-8....31
andsocket[1]->33-34, 36-37, 49-40....63
so that I can compare this against thecore_sibling
which basically tells all the cores within a socket.So, I introduced this
PAL_SOCKET_INFO
struct which is a little different thanPAL_RANGE_INFO
which only hasstart
andend
of range. Here we also need to keep track of therange_count
as it is unique to each socket. We could have usedPAL_RANGE_INFO
struct but this also has resource count which we don't need here.This struct is only used for sanitizing the sockets. How about I declare this struct inside the function?
I see. Declaring a new type just to save space for one field doesn't look good -- it raises eyebrows and hurts readability.
Let's just use PAL_RES_RANGE_INFO
instead of this one.
Pal/src/host/Linux-SGX/db_main.c, line 142 at r3 (raw file):
if (!ranges) { log_error("Range allocation failed"); return -1;
Please use PAL error codes here. return -PAL_ERROR_NOMEM
Pal/src/host/Linux-SGX/db_main.c, line 148 at r3 (raw file):
range_cnt * sizeof(PAL_RANGE_INFO))) { log_error("Copying ranges into the enclave failed"); return -1;
return -PAL_ERROR_DENIED
Also, you leak memory, please free(ranges)
Pal/src/host/Linux-SGX/db_main.c, line 157 at r3 (raw file):
} /* This fucntion does the following 3 sanitizations for a given resource range:
function
Pal/src/host/Linux-SGX/db_main.c, line 171 at r3 (raw file):
if (!IS_IN_RANGE_INCL(resource_count, res_min_limit, res_max_limit)) { log_error("Invalid resource count: %ld", resource_count); return -1;
I just noticed that we have weird and inconsistent error codes in these functions: sometimes you return -1
, sometimes -EINVAL
(UNIX error codes), and never return -PAL_ERROR_INVAL
(PAL error codes). Why so?
Please unify on either UNIX error codes or PAL error codes. I wonder why we had this inconsistency before...
Pal/src/host/Linux-SGX/db_main.c, line 178 at r3 (raw file):
int64_t range_count = (int64_t)res_info.range_count; if (!IS_IN_RANGE_INCL(range_count, 1, 1 << 7)) { log_error("Invalid range count: %ld\n", range_count);
log_*
functions already add \n
at the end. Please remove \n
here and everywhere else.
Pal/src/host/Linux-SGX/db_main.c, line 195 at r3 (raw file):
end = start; else if (res_info.ranges[i].end <= INT64_MAX) end = (int64_t)res_info.ranges[i].end;
Double space
Pal/src/host/Linux-SGX/db_main.c, line 206 at r3 (raw file):
if ((start != end) && !IS_IN_RANGE_INCL(end, start + 1, range_max_limit)) { log_error("Invalid end range: %ld\n", start);
start
-> end
Pal/src/host/Linux-SGX/db_main.c, line 212 at r3 (raw file):
/* check for malformed ranges */ if (previous_end >= end) { log_error("Malformed range: previous_end =%ld, current end = %ld\n", previous_end, end);
previous_end =%ld
-> previous_end = %ld
(add a space)
Pal/src/host/Linux-SGX/db_main.c, line 244 at r3 (raw file):
if (cache[lvl].level > INT64_MAX) return -1;
You have a bunch of lines like this. Let's just merge them into one big IF for readability:
if (cache[lvl].level > INT64_MAX || cache[lvl].size || cache[lvl].coherency_line_size ..)
return -EINVAL;
Pal/src/host/Linux-SGX/db_main.c, line 256 at r3 (raw file):
} int64_t multiplier;
You can just init to = 1
and remove the last else
Pal/src/host/Linux-SGX/db_main.c, line 304 at r3 (raw file):
if (idx != 0) { /* core 0 is always online */ if (core_topology[idx].is_logical_core_online > INT64_MAX) return -1;
Wrong indentation
Pal/src/host/Linux-SGX/db_main.c, line 311 at r3 (raw file):
} if (core_topology[idx].core_id > INT64_MAX)
You can merge this clause into the previous IF statement and remove these two lines.
Pal/src/host/Linux-SGX/db_main.c, line 366 at r3 (raw file):
* Note: Although a clever attacker might modify both of these values, idea here is to * provide a consistent view of the topology. */ int64_t range_count;
Why is this helper var declared here? You use it in both branches locally, so just declare it there two times. Also, the name range_count
is misleading. It is more like range_idx
or socket_range_idx
.
Pal/src/host/Linux-SGX/db_main.c, line 387 at r3 (raw file):
} /* Verify against core-siblings */
I got lost in this loop. Could you expand your comment: what do you verify against what?
Pal/src/host/Linux-SGX/db_main.c, line 412 at r3 (raw file):
} }
Please add ret = 0
here, for sanity
Pal/src/host/Linux-SGX/db_main.c, line 414 at r3 (raw file):
out_socket: for (int64_t i =0 ; i < num_sockets; i++) {
i =0 ;
-> i = 0;
Pal/src/host/Linux-SGX/db_main.c, line 447 at r3 (raw file):
int64_t index = j / (sizeof(int) * BITS_IN_BYTE); if (index >= num_cpumask) { ret = -1;
Please don't use -1
as return value.
Pal/src/host/Linux-SGX/db_main.c, line 451 at r3 (raw file):
} if (bitmap[index] >> (j % (sizeof(int) * BITS_IN_BYTE)) == 1) { log_error("Invalid numa_topology! Core: %ld found in multiple numa nodes", j);
Slightly better message: Invalid NUMA topology: core %ld found in multiple numa nodes
Pal/src/host/Linux-SGX/db_main.c, line 469 at r3 (raw file):
if (total_cores_in_numa != num_cores) { log_error("Invalid numa_topology! More cores in NUMA than online");
Slightly better message: Invalid NUMA topology: more cores in NUMA than online
Pal/src/host/Linux-SGX/db_main.c, line 495 at r3 (raw file):
int ret = copy_hw_resource_range(&src[idx].core_siblings, &core_topo[idx].core_siblings); if (ret < 0) { log_error("Copying core_topo[%ld].core_siblings failed", idx);
You leek memory. You should free(core_topo)
Pal/src/host/Linux-SGX/db_main.c, line 501 at r3 (raw file):
ret = copy_hw_resource_range(&src[idx].thread_siblings, &core_topo[idx].thread_siblings); if (ret < 0) { log_error("Copying core_topo[%ld].core_siblings failed", idx);
You leek memory. You should free(core_topo)
Pal/src/host/Linux-SGX/db_main.c, line 509 at r3 (raw file):
sizeof(PAL_CORE_CACHE_INFO)); if (!cache_info) { log_error("Allocation for cache_info failed");
You leek memory. You should free(core_topo)
Pal/src/host/Linux-SGX/db_main.c, line 525 at r3 (raw file):
&cache_info[lvl].shared_cpu_map); if (ret < 0) { log_error("Copying core_topo[%ld].cache[%ld].shared_cpu_map failed", idx, lvl);
You leek memory. You should free(core_topo)
and free(cache_info)
Pal/src/host/Linux-SGX/db_main.c, line 550 at r3 (raw file):
int ret = copy_hw_resource_range(&src[idx].cpumap, &numa_topo[idx].cpumap); if (ret < 0) { log_error("Copying numa_topo[%ld].core_siblings failed", idx);
leak memory
Pal/src/host/Linux-SGX/db_main.c, line 556 at r3 (raw file):
ret = copy_hw_resource_range(&src[idx].distance, &numa_topo[idx].distance); if (ret < 0) { log_error("Copying numa_topo[%ld].core_siblings failed", idx);
leak memory
Pal/src/host/Linux-SGX/db_main.c, line 566 at r3 (raw file):
/* This function doesn't clean up resources on failure, assuming that we terminate right away in * such case. */
Ah, is this why you're not freeing memory anywhere? You can add some kind of a top-level comment that /* All sanitization functions below do not free memory ...*/
Pal/src/host/Linux-SGX/db_misc.c, line 730 at r3 (raw file):
ci->online_logical_cores = g_pal_sec.online_logical_cores; ci->physical_cores_per_socket = g_pal_sec.physical_cores_per_socket; ci->cpu_socket = g_pal_sec.cpu_socket;
Why is this removed? I understand that you removed these fields from g_pal_sec
but aren't you supposed to get these fields from your g_pal_sec.topo_info.*
fields?
Pal/src/host/Linux-SGX/db_misc.c, line 789 at r3 (raw file):
if (!ranges) { log_error("Range allocation failed"); return -1;
Please return PAL error codes here. In this case, you should do return -PAL_ERROR_NOMEM
Pal/src/host/Linux-SGX/db_misc.c, line 804 at r3 (raw file):
if (ret < 0) { log_error("Copying g_pal_sec.topo_info.online_logical_cores failed"); return -1;
return ret
Pal/src/host/Linux-SGX/db_misc.c, line 811 at r3 (raw file):
if (ret < 0) { log_error("Copying g_pal_sec.topo_info.possible_logical_cores failed"); return -1;
ditto
Pal/src/host/Linux-SGX/db_misc.c, line 811 at r3 (raw file):
if (ret < 0) { log_error("Copying g_pal_sec.topo_info.possible_logical_cores failed"); return -1;
You actually leak memory here (topo_info->online_logical_cores.ranges
was allocated in the previous lines and never freed). On the other hand, I don't care about this scenario that much -- if we fail here, then we are out-of-memory and will die soon anyway.
So whatever you prefer: either add some comment at the top of this function that it leaks memory on failures, or free explicitly here.
Pal/src/host/Linux-SGX/db_misc.c, line 817 at r3 (raw file):
if (ret < 0) { log_error("Copying g_pal_sec.topo_info.nodes failed"); return -1;
ditto (return PAL error code; leaking memory)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 71 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv and @vijaydhanraj)
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Given your other PR on
getcpu()
support, what should be the order of merging this PRs?
I prefer to merge in the following order,
- Refactor sysfs to improve sanitization (as we do have some TODOs in
getcpu()
PR with respect to sanitization which this one will address) - Add checkpoint-and-restore support for sysfs
- Finally,
getcpu()
PR.
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Actually, I still don't understand step 2. Aren't you supposed to
2. Sanitize CPU/NUMA information based on untrusted integers
? I mean, it's not aboutsysfs paths
because at this point you don't have any paths -- you just have an internal representation of the CPU/NUMA topology which is represented as a bunch of integers.Please reword.
I meant sanitization of untrusted integers read from sysfs paths, but I see that it isn't clear. Will make it more explicit as you have suggested.
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
This is a strange message. What you do here is add a new
realloc_size()
function. I don't see how adding something new is "exposing internal implementation". Am I missing something? Do you have tworealloc_size()
functions now? If yes, then why?
Yes, we do. One is defined statically here, https://github.com/gramineproject/graphene/blob/master/LibOS/shim/src/fs/proc/info.c#L57. I tried to expose this internal function to Linux-SGX pal
as well. How about we change to Add realloc API support to Linux-SGX pal
?
LibOS/shim/include/shim_fs_pseudo.h, line 211 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I see. Yes, please add an assert then.
Done.
LibOS/shim/include/shim_fs_pseudo.h, line 212 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Thanks. No, it's not about saving stack space (which of course is also true, but not the main point). We do this because that's the norm in C world, so the C developer who will read
PAL_RES_RANGE_INFO res_range_info
in 5 years from now will immediatelly assume that the typePAL_RES_RANGE_INFO
is a typedef for some "pointer to some struct", and thatres_range_info
was passed by reference. Of course, quite quickly the developer will figure out this is not the case and will write us some hate mails :)Basically, this change is to stick to the principle of least surprise. 99.9% of the C code is written this way, and there is no reason in this case to be in that 0.01%.
I see, so the assumption is that if it is a typedef
struct then it is generally a "pointer to some struct".
LibOS/shim/include/shim_fs_pseudo.h, line 216 at r3 (raw file):
It should be sys_convert_ranges_to_str (notice the plural).
It is not necessary that we always will have multiple ranges. For example, possible_online_cores
will just have a single range. Another example is numa online
. But have reworded the comment and changed range
->ranges
LibOS/shim/include/shim_fs_pseudo.h, line 218 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Use the plural in function name (
ranges
)
Done.
LibOS/shim/include/shim_fs_pseudo.h, line 219 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
By the way, what is the need for
sep
? Isn't it always,
(comma)?
yes, almost everywhere except when representing NUMA distance which is of form 10 20
where we have a space separator.
LibOS/shim/include/shim_fs_pseudo.h, line 221 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why do you say
ranges
? Isn't it just one range that you convert? If it's just one range, why do you sayranges.start
/ranges.end
?
please refer to the earlier comment on this.
LibOS/shim/src/fs/sys/cache_info.c, line 57 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I quite like this new code! Very readable.
thanks :)
LibOS/shim/src/fs/sys/cpu_info.c, line 16 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
No reason to
= 0
. But not blocking.
Done.
LibOS/shim/src/fs/sys/fs.c, line 47 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Ok, two things.
Why magic value of
UINT64_MAX
? If you have a range consisting of only one value, then you can always specifyranges[0].start = ranges[0].end = 20
, which gives you the single value20
. I highly dislike this magic value, when there is a human-readable alternative.I still don't see a need in
end_str
. You could simply do:if (res_range_info->ranges[i].start == res_range_info->ranges[i].end) { // print a single value snprintf(str + offset, max_len - offset, "%lu%s", res_range_info->ranges[i].start, (i + 1 == range_cnt) ? "\0" : sep); } else { // print a range snprintf(str + offset, max_len - offset, "%lu-%lu%s", res_range_info->ranges[i].start, res_range_info->ranges[i].end, (i + 1 == range_cnt) ? "\0" : sep); }
Removed end marker UINT64_MAX
and also removed end_str
helper variable.
LibOS/shim/src/fs/sys/fs.c, line 44 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why so weird? Why not
if (offset > max_len)
? This is much more readable ("if current offset exceeds the maximum length").This also allows you to change the types from
int
touint64_t
everywhere.
Done.
LibOS/shim/src/fs/sys/fs.c, line 69 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why do you have this check here? Shouldn't this check be in the code where you init
g_pal_control->topo_info.possible_logical_cores.resource_count
?
Only inside the enclave do we sanitize the values and ensure that the value falls within a sane range. But for non-sgx cases, such sanitization is not done and so added this check.
LibOS/shim/src/fs/sys/fs.c, line 76 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please use
uint32_t
instead ofunsigned int
Done.
LibOS/shim/src/fs/sys/fs.c, line 91 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Don't you have
BITS_IN_INT
to replace this(sizeof(int) * BITS_IN_BYTE)
?
no, I have BITS_TO_INTS
macro which is used to calculate the number of integers required to represent a given value which is different than this case.
LibOS/shim/src/fs/sys/fs.c, line 96 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
ditto
please see earlier comment.
LibOS/shim/src/fs/sys/fs.c, line 103 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
ditto (use more straight-forward check)
Done.
LibOS/shim/src/fs/sys/node_info.c, line 17 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please move to the very beginning of this function. Also,
= 0
is not needed here.
Done.
LibOS/shim/src/fs/sys/node_info.c, line 52 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You should also add
else { ... failed to recognize hugepages ... }
here.
Done.
Pal/include/arch/x86_64/pal-arch.h, line 294 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Wait, how is this "self-explanatory"? When I look at this file
pal-arch.h
, I have no idea what these constants mean. As a future reader of this code, I shouldn't grep our whole code base to realize that these constants are used to describe CPU cache types. AndUNIFIED
keyword can mean anything... I still wantCACHE_TYPE_
prefix here (or whatever you think better describes it).
OK, added CACHE_TYPE
prefix.
Pal/include/arch/x86_64/pal-arch.h, line 35 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
No need to capitalize the word
macro
. Also, you can shorten to just:NOTE: Used to allocate on stack; increase with caution or use malloc instead.
Done.
Pal/include/arch/x86_64/pal-arch.h, line 39 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
ditto
Done.
Pal/include/arch/x86_64/pal-arch.h, line 309 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
No need for this magic value. If
start == end
, then it's not a range but a single value.
removed this comment as we now use start == end
check.
Pal/include/arch/x86_64/pal-arch.h, line 315 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please remove trailing
,
in the example
Done.
Pal/include/arch/x86_64/pal-arch.h, line 332 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Actually, I don't understand how this comment concerns this struct. Did it get stale?
No, it is not stale. We update this variable by reading /sys/devices/system/cpu/cpuX/online
file but for cpu0
there is no such file as it is always online. Please see here, https://github.com/gramineproject/graphene/pull/2661/files#diff-713ce9dc6240f95fd6e13e21ce6a56d6da07627ea2329a200c062e750696ae52R308.
Pal/include/host/Linux-common/topo_info.h, line 14 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
size_multiplier
->size_mult
are
->is
Also,
size qualifier like "K"/"M"/"G"
sounds like you will save the string intosize_mult
Done and reworded.
Pal/src/host/Linux/arch/x86_64/db_arch_misc.c, line 167 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why did you remove it? Shouldn't you use
pal_control->topo_info
to populate these fields?
I have consolidated all topology related info to PAL_TOPO_INFO
. This will make things easier when introducing checkpoint and restore code for sysfs.
Pal/src/host/Linux-common/topo_info.c, line 27 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Add a space after
if
Done.
Pal/src/host/Linux-common/topo_info.c, line 76 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
This comment is now stale. It's not just counting the number of resources, but also populating the
res_info
array.
Updated the comment.
Pal/src/host/Linux-common/topo_info.c, line 128 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
The two new code snippets seem to be almost exactly the same. Cannot we move them out to the outer scope and merge into one?
yes merged.
Pal/src/host/Linux-common/topo_info.c, line 209 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Hm... Maybe we should rename
shared_cpu_map
struct field toshared_cpu_list
? I don't have a strong opinion on this, so feel free to ignore.
We do read shared_cpu_list
but we use this to generate shared_cpu_map
so named it this way.
Pal/src/host/Linux-common/topo_info.c, line 209 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Could you please add argument names in all these
get_hw_resource()
invokations? In particular, here it will be:get_hw_resource(filename, /*count=*/true, &core_cache[lvl].shared_cpu_map, /*size_mult=*/NULL);
...and everywhere else please.
Done.
Pal/src/host/Linux-common/topo_info.c, line 223 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
It looks like this
READ_FILE_BUFFER
macro is almost not used. Can we replace it with explicit code then?
yes removed the macro.
Pal/src/host/Linux-common/topo_info.c, line 289 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why this check
possible_logical_cores > 0
? How canpossible_logical_cores
be zero or less?
Yes, this check is not needed now. Removed it.
Pal/src/host/Linux-common/topo_info.c, line 290 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
some
->Some
Graphene
->Gramine
Done.
Pal/src/host/Linux-common/topo_info.c, line 321 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You have double-space
Done.
Pal/src/host/Linux-common/topo_info.c, line 340 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You have double-space
Done.
Pal/src/host/Linux-common/topo_info.c, line 343 at r3 (raw file):
First, this is wrong: if your first CPU (cpu0) belongs to socket 2
Can Linux have socket assignments like this? IIUC, on 128 core system with HT, the socket assignment will be 0-31-> socket 0, 32-63->socket 1, 64-95 (siblings of 0-31)-> socket 0, 96-127 (siblings of 32-63)->socket 1. Linux places the cores in increasing order of sockets so this shouldn't be an issue.
It feels like topo_info->num_sockets = current_max_socket + 1 is sufficient.
yes, we can use current_max_socket
itself instead of using a new variable.
Pal/src/host/Linux-common/topo_info.c, line 353 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Could it be that
core_topology[0].thread_siblings.resource_count == 0
?I just don't want a divison-by-zero error in Gramine. If this is possible, please add an additional check. If it is not possible, feel free to ignore my comment.
No, we update resource_count
only if it is greater than 0 else we fail. Please see here, https://github.com/gramineproject/graphene/pull/2661/files#diff-713ce9dc6240f95fd6e13e21ce6a56d6da07627ea2329a200c062e750696ae52R137
Pal/src/host/Linux-SGX/db_main.c, line 339 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I see. Declaring a new type just to save space for one field doesn't look good -- it raises eyebrows and hurts readability.
Let's just use
PAL_RES_RANGE_INFO
instead of this one.
OK, done.
Pal/src/host/Linux-SGX/db_main.c, line 142 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please use PAL error codes here.
return -PAL_ERROR_NOMEM
Please refer to the other relating comment.
Pal/src/host/Linux-SGX/db_main.c, line 148 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
return -PAL_ERROR_DENIED
Also, you leak memory, please
free(ranges)
Please refer to the other relating comment. Also, if we fail, we exit and so don't free resources. We have this comment, https://github.com/gramineproject/graphene/blob/master/Pal/src/host/Linux-SGX/db_main.c#L357
Pal/src/host/Linux-SGX/db_main.c, line 157 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
function
Done.
Pal/src/host/Linux-SGX/db_main.c, line 171 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I just noticed that we have weird and inconsistent error codes in these functions: sometimes you return
-1
, sometimes-EINVAL
(UNIX error codes), and never return-PAL_ERROR_INVAL
(PAL error codes). Why so?Please unify on either UNIX error codes or PAL error codes. I wonder why we had this inconsistency before...
The reason, I have -1
is because we don't pass the return value back to the caller but merely use the return value to check error and exit. Please see here, https://github.com/gramineproject/graphene/blob/master/Pal/src/host/Linux-SGX/db_main.c#L650
To avoid any confusion, I have changed every failure to return -1
.
Pal/src/host/Linux-SGX/db_main.c, line 178 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
log_*
functions already add\n
at the end. Please remove\n
here and everywhere else.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 195 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Double space
Done.
Pal/src/host/Linux-SGX/db_main.c, line 206 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
start
->end
Done.
Pal/src/host/Linux-SGX/db_main.c, line 212 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
previous_end =%ld
->previous_end = %ld
(add a space)
Done.
Pal/src/host/Linux-SGX/db_main.c, line 244 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You have a bunch of lines like this. Let's just merge them into one big IF for readability:
if (cache[lvl].level > INT64_MAX || cache[lvl].size || cache[lvl].coherency_line_size ..) return -EINVAL;
Currently, we do check for overflow for each variable and then assign the value. Bunching it together makes one wonder why we want to check against INT64_MAX
at the beginning. So have left it as is. But if you still prefer to merge, will do so.
Pal/src/host/Linux-SGX/db_main.c, line 256 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You can just init to
= 1
and remove the lastelse
Done.
Pal/src/host/Linux-SGX/db_main.c, line 304 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Wrong indentation
Done.
Pal/src/host/Linux-SGX/db_main.c, line 311 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You can merge this clause into the previous IF statement and remove these two lines.
Let us not do this but have the overflow checks right before the assignment. This is better and aligns with how we declare/define the variable right before its usage.
Pal/src/host/Linux-SGX/db_main.c, line 366 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why is this helper var declared here? You use it in both branches locally, so just declare it there two times. Also, the name
range_count
is misleading. It is more likerange_idx
orsocket_range_idx
.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 387 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I got lost in this loop. Could you expand your comment: what do you verify against what?
core-siblings represent all the cores that are part of a socket. We cross-verify the socket info against this. Also, updated the comment.
Pal/src/host/Linux-SGX/db_main.c, line 412 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please add
ret = 0
here, for sanity
Sorry, I don't understand. We do set ret = 0
at the beginning of the function and if we get a failure, immediately bail out. Please clarify.
Pal/src/host/Linux-SGX/db_main.c, line 414 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
i =0 ;
->i = 0;
Done.
Pal/src/host/Linux-SGX/db_main.c, line 447 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please don't use
-1
as return value.
Please see my earlier comment on this.
Pal/src/host/Linux-SGX/db_main.c, line 451 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Slightly better message:
Invalid NUMA topology: core %ld found in multiple numa nodes
Done.
Pal/src/host/Linux-SGX/db_main.c, line 469 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Slightly better message:
Invalid NUMA topology: more cores in NUMA than online
Done.
Pal/src/host/Linux-SGX/db_main.c, line 495 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You leek memory. You should
free(core_topo)
Please see the earlier comment on this.
Pal/src/host/Linux-SGX/db_main.c, line 501 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You leek memory. You should
free(core_topo)
Please see the earlier comment on this.
Pal/src/host/Linux-SGX/db_main.c, line 509 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You leek memory. You should
free(core_topo)
Please see the earlier comment on this.
Pal/src/host/Linux-SGX/db_main.c, line 525 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You leek memory. You should
free(core_topo)
andfree(cache_info)
Please see the earlier comment on this.
Pal/src/host/Linux-SGX/db_main.c, line 550 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
leak memory
Please see the earlier comment on this.
Pal/src/host/Linux-SGX/db_main.c, line 556 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
leak memory
Please see the earlier comment on this.
Pal/src/host/Linux-SGX/db_main.c, line 566 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Ah, is this why you're not freeing memory anywhere? You can add some kind of a top-level comment that
/* All sanitization functions below do not free memory ...*/
Added this comment /* All sanitization functions below do not free memory. We simply exit on failure. */
at the beginning of the sanitization functions.
Pal/src/host/Linux-SGX/db_misc.c, line 730 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why is this removed? I understand that you removed these fields from
g_pal_sec
but aren't you supposed to get these fields from yourg_pal_sec.topo_info.*
fields?
So, these variables were used when generating info for /proc/cpuinfo
. Now, I do use topo_info
to extract this data. Please see here, https://github.com/gramineproject/graphene/pull/2661/files#diff-11a071b23ae1c595b667fdd7e80bcf9e1d62866e11fb91b5b3efbdd7a4224409R128.
Please clarify if I misunderstood.
Pal/src/host/Linux-SGX/db_misc.c, line 789 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please return PAL error codes here. In this case, you should do
return -PAL_ERROR_NOMEM
Please see the earlier comment on this.
Pal/src/host/Linux-SGX/db_misc.c, line 804 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
return ret
Please see the earlier comment on this.
Pal/src/host/Linux-SGX/db_misc.c, line 811 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
ditto
Please see the earlier comment on this.
Pal/src/host/Linux-SGX/db_misc.c, line 811 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You actually leak memory here (
topo_info->online_logical_cores.ranges
was allocated in the previous lines and never freed). On the other hand, I don't care about this scenario that much -- if we fail here, then we are out-of-memory and will die soon anyway.So whatever you prefer: either add some comment at the top of this function that it leaks memory on failures, or free explicitly here.
I have added a comment at top of sanitization functions.
Pal/src/host/Linux-SGX/db_misc.c, line 817 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
ditto (return PAL error code; leaking memory)
Please see the earlier comment on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 72 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv and @vijaydhanraj)
a discussion (no related file):
@vijaydhanraj Did you forget to push the new changes?
0dc6f10
to
8f7fa8f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 11 of 20 files reviewed, 72 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @vijaydhanraj)
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
@vijaydhanraj Did you forget to push the new changes?
oh, I am sorry, just forced pushed the changes since we changed commit messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 9 of 9 files at r4, all commit messages.
Reviewable status: all files reviewed, 31 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @vijaydhanraj)
a discussion (no related file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
I prefer to merge in the following order,
- Refactor sysfs to improve sanitization (as we do have some TODOs in
getcpu()
PR with respect to sanitization which this one will address)- Add checkpoint-and-restore support for sysfs
- Finally,
getcpu()
PR.
Got it. Let me block getcpu
PR then on this.
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Yes, we do. One is defined statically here, https://github.com/gramineproject/graphene/blob/master/LibOS/shim/src/fs/proc/info.c#L57. I tried to expose this internal function to
Linux-SGX pal
as well. How about we change toAdd realloc API support to Linux-SGX pal
?
Ah, I see. Just call this commit
[Pal] Add `realloc_size()` in PAL slab manager
LibOS/shim/include/shim_fs_pseudo.h, line 212 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
I see, so the assumption is that if it is a
typedef
struct then it is generally a "pointer to some struct".
Yes. The emphasis is though on the word struct
not on the word typedef
.
E.g., typedef uint8_t char
is perfectly fine. But typedef struct { ... }
-- here devs expect that this is passed as a pointer to functions.
LibOS/shim/include/shim_fs_pseudo.h, line 219 at r3 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
yes, almost everywhere except when representing NUMA distance which is of form
10 20
where we have a space separator.
OK, got it.
LibOS/shim/include/shim_fs_pseudo.h, line 214 at r4 (raw file):
/* Converts integer to a string. If size multiplier like 'K', 'M' or 'G' is passed, it is appended * to the string. */ int sys_convert_int_to_str(uint64_t val, uint64_t size_mult, char* str, size_t max_len);
Please replace max_len
with size
(or max_size
).
You're passing not string lengths but string sizes in all call sites (e.g., you pass sizeof(str)
). The name len
confused me.
LibOS/shim/include/shim_fs_pseudo.h, line 218 at r4 (raw file):
/* Converts array of integer range(s) to a string. For example if ranges[0].start and ranges[0].end * were 0 and 63 respectively, then `0-63` string is generated. */ int sys_convert_ranges_to_str(const PAL_RES_RANGE_INFO* res_range_info, char* str, size_t max_len,
Please replace max_len
with size
(or max_size
).
You're passing not string lengths but string sizes in all call sites (e.g., you pass sizeof(str)
). The name len
confused me.
LibOS/shim/include/shim_fs_pseudo.h, line 224 at r4 (raw file):
* ranges[0].end were 0 and 31 respectively, then `00000000,ffffffff` string is generated. */ int sys_convert_ranges_to_cpu_bitmap_str(const PAL_RES_RANGE_INFO* res_range_info, char* str, size_t max_len);
Please replace max_len
with size
(or max_size
).
You're passing not string lengths but string sizes in all call sites (e.g., you pass sizeof(str)
). The name len
confused me.
LibOS/shim/src/fs/sys/fs.c, line 91 at r3 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
no, I have
BITS_TO_INTS
macro which is used to calculate the number of integers required to represent a given value which is different than this case.
Well, true. But you have BITS_IN_TYPE
, you can use it here.
LibOS/shim/src/fs/sys/fs.c, line 96 at r3 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
please see earlier comment.
ditto (use BITS_IN_TYPE
)
LibOS/shim/src/fs/sys/fs.c, line 19 at r4 (raw file):
int sys_convert_int_to_str(uint64_t val, uint64_t size_mult, char* str, size_t max_len) { int ret = 0; assert(max_len > 0);
This is useless now, because max_len
is of unsigned type. Please remove.
LibOS/shim/src/fs/sys/fs.c, line 46 at r4 (raw file):
size_t offset = 0; for (int64_t i = 0; i < range_cnt; i++) { if (offset > max_len)
It should be offset >= max_len
. Otherwise your loop will become infinite.
LibOS/shim/src/fs/sys/fs.c, line 70 at r4 (raw file):
size_t max_len) { if (g_pal_control->topo_info.possible_logical_cores.resource_count > INT64_MAX) return -1;
-EINVAL
LibOS/shim/src/fs/sys/fs.c, line 81 at r4 (raw file):
if (res_range_info->range_count > INT64_MAX) return -EINVAL;
You leak bitmap
here. Simply move this check above the calloc
code.
LibOS/shim/src/fs/sys/fs.c, line 87 at r4 (raw file):
uint64_t end = res_range_info->ranges[i].end; if (start > INT64_MAX || end > INT64_MAX) return -EINVAL;
You leak bitmap
here
LibOS/shim/src/fs/sys/fs.c, line 101 at r4 (raw file):
size_t offset = 0; for (int64_t j = num_cpumask - 1; j >= 0; j-- ) { if (offset > max_len) {
It should be offset >= max_len
. Otherwise your loop will become infinite.
LibOS/shim/src/fs/sys/fs.c, line 105 at r4 (raw file):
goto out; } int ret = snprintf(str + offset, max_len - offset, "%x%x%x%x%x%x%x%x%s",
You re-define ret
here. This will lead to undefined behavior because the outer-scope ret
(which is returned in the out
label) will not be initialized...
Please remove the int
word basically.
LibOS/shim/src/fs/sys/node_info.c, line 27 at r4 (raw file):
if (ret < 0) return ret;
This looks ugly... Why not just:
if (strcmp(name, "online") != 0) {
log_debug("unrecognized file: %s", name);
return -ENOENT;
}
ret = sys_convert_ranges_to_str(&g_pal_control->topo_info.nodes, str, sizeof(str), ",");
if (ret < 0)
return ret;
return sys_load(str, out_data, out_size);
LibOS/shim/src/fs/sys/node_info.c, line 54 at r4 (raw file):
str, sizeof(str)); } else { log_debug("unrecognized hugepage file: %s", parent_name);
Where is ret = -ENOENT
in this case?
Pal/src/host/Linux/arch/x86_64/db_arch_misc.c, line 167 at r3 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
I have consolidated all topology related info to
PAL_TOPO_INFO
. This will make things easier when introducing checkpoint and restore code for sysfs.
Ah, got it. You moved this stuff from this function _DkGetCPUInfo()
to _DkGetTopologyInfo()
.
Pal/src/host/Linux-common/topo_info.c, line 343 at r3 (raw file):
Can Linux have socket assignments like this?
Yes, I don't think a benign Linux host won't do assignments like this. But we should also protect against a malicious Linux host, right?
Anyways, resolving this one since you fixed it.
Pal/src/host/Linux-SGX/db_main.c, line 171 at r3 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
The reason, I have
-1
is because we don't pass the return value back to the caller but merely use the return value to check error and exit. Please see here, https://github.com/gramineproject/graphene/blob/master/Pal/src/host/Linux-SGX/db_main.c#L650To avoid any confusion, I have changed every failure to return
-1
.
OK, got it.
Pal/src/host/Linux-SGX/db_main.c, line 178 at r3 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Done.
Not done?
Pal/src/host/Linux-SGX/db_main.c, line 244 at r3 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Currently, we do check for overflow for each variable and then assign the value. Bunching it together makes one wonder why we want to check against
INT64_MAX
at the beginning. So have left it as is. But if you still prefer to merge, will do so.
Hm, ok. Let's keep your logic.
Pal/src/host/Linux-SGX/db_main.c, line 256 at r3 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Done.
Almost, forgot to add a space (=1
-> = 1
)
Pal/src/host/Linux-SGX/db_main.c, line 412 at r3 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Sorry, I don't understand. We do set
ret = 0
at the beginning of the function and if we get a failure, immediately bail out. Please clarify.
That's a typical way of writing such code snippets. It conveys that "If I got to this point in the function, I consider it a success".
Your current way is syntactically equivalent but conveys a more contrived logic: "I first assume that the function will end successfully, and if something fails, I will immediately bail out; so if I got to this point in the function, I know that my initial assumption indeed was correct".
In other words, it's just that it's easier to read.
Pal/src/host/Linux-SGX/db_main.c, line 414 at r3 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Done.
Not completely done. i = 0 ;
-> i = 0;
Pal/src/host/Linux-SGX/db_main.c, line 566 at r3 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Added this comment
/* All sanitization functions below do not free memory. We simply exit on failure. */
at the beginning of the sanitization functions.
You can remove this comment here now.
Pal/src/host/Linux-SGX/db_main.c, line 157 at r4 (raw file):
} /* All sanitization functions below do not free memory. We simply exit on failure. */
Shouldn't you move this comment one more function above (above copy_hw_resource_range
)?
Also, you have a double-space.
Pal/src/host/Linux-SGX/db_main.c, line 193 at r4 (raw file):
int64_t start = (int64_t)res_info.ranges[i].start; if (res_info.ranges[i].end > INT64_MAX)
I would merge these two IFs into one, will look nicer.
Pal/src/host/Linux-SGX/db_main.c, line 339 at r4 (raw file):
int64_t num_sockets = (int64_t)topo_info.num_sockets; PAL_RES_RANGE_INFO * socket_info =
PAL_RES_RANGE_INFO *
-> PAL_RES_RANGE_INFO*
Pal/src/host/Linux-SGX/db_main.c, line 340 at r4 (raw file):
int64_t num_sockets = (int64_t)topo_info.num_sockets; PAL_RES_RANGE_INFO * socket_info = (PAL_RES_RANGE_INFO*)calloc(num_sockets, sizeof(PAL_RES_RANGE_INFO ));
You have a problem with spaces :) Please remove one here
Pal/src/host/Linux-SGX/db_main.c, line 435 at r4 (raw file):
uint64_t end = numa_topology[idx].cpumap.ranges[i].end; for (int64_t j = (int64_t)start; j <= (int64_t)end; j++) { int64_t index = j / (sizeof(int) * BITS_IN_BYTE);
Use BITS_IN_TYPE
Pal/src/host/Linux-SGX/db_main.c, line 440 at r4 (raw file):
goto out_numa; } if (bitmap[index] >> (j % (sizeof(int) * BITS_IN_BYTE)) == 1) {
Use BITS_IN_TYPE
Pal/src/host/Linux-SGX/db_main.c, line 445 at r4 (raw file):
goto out_numa; } bitmap[index] |= 1U << (j % (sizeof(int) * BITS_IN_BYTE));
Use BITS_IN_TYPE
Pal/src/host/Linux-SGX/db_misc.c, line 804 at r3 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Please see the earlier comment on this.
Wait, but this function is different, right? It is the PAL (internal) interface, so it must return PAL error codes, not just -1
. See for example _DkGetCPUInfo()
above.
I am fine with returning -1
in static internal funcs for sanitization, but definitely not in _DkXXX
functions.
Pal/src/host/Linux-SGX/db_misc.c, line 811 at r3 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Please see the earlier comment on this.
ditto
Pal/src/host/Linux-SGX/db_misc.c, line 811 at r3 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
I have added a comment at top of sanitization functions.
This comment is in another file. Can you add the comment to this file as well?
Pal/src/host/Linux-SGX/db_misc.c, line 817 at r3 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Please see the earlier comment on this.
ditto
8f7fa8f
to
15962c0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 31 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @vijaydhanraj)
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Ah, I see. Just call this commit
[Pal] Add `realloc_size()` in PAL slab manager
OK, done.
LibOS/shim/include/shim_fs_pseudo.h, line 212 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Yes. The emphasis is though on the word
struct
not on the wordtypedef
.E.g.,
typedef uint8_t char
is perfectly fine. Buttypedef struct { ... }
-- here devs expect that this is passed as a pointer to functions.
Got it, thanks for the explanation.
LibOS/shim/include/shim_fs_pseudo.h, line 214 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please replace
max_len
withsize
(ormax_size
).You're passing not string lengths but string sizes in all call sites (e.g., you pass
sizeof(str)
). The namelen
confused me.
changed max_len
-> max_size
LibOS/shim/include/shim_fs_pseudo.h, line 218 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please replace
max_len
withsize
(ormax_size
).You're passing not string lengths but string sizes in all call sites (e.g., you pass
sizeof(str)
). The namelen
confused me.
Done.
LibOS/shim/include/shim_fs_pseudo.h, line 224 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please replace
max_len
withsize
(ormax_size
).You're passing not string lengths but string sizes in all call sites (e.g., you pass
sizeof(str)
). The namelen
confused me.
Done.
LibOS/shim/src/fs/sys/fs.c, line 91 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Well, true. But you have
BITS_IN_TYPE
, you can use it here.
oh right, yes done.
LibOS/shim/src/fs/sys/fs.c, line 96 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
ditto (use
BITS_IN_TYPE
)
Done.
LibOS/shim/src/fs/sys/fs.c, line 19 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
This is useless now, because
max_len
is of unsigned type. Please remove.
Done.
LibOS/shim/src/fs/sys/fs.c, line 46 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
It should be
offset >= max_len
. Otherwise your loop will become infinite.
Sorry, how do we run into an infinite loop? The loop only runs range_cnt
times which is unmodified.
Also, what is the use of offset == max_len
condition? It is going to be a no-op right?
LibOS/shim/src/fs/sys/fs.c, line 70 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
-EINVAL
Done.
LibOS/shim/src/fs/sys/fs.c, line 81 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You leak
bitmap
here. Simply move this check above thecalloc
code.
yes true, done.
LibOS/shim/src/fs/sys/fs.c, line 87 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You leak
bitmap
here
Done.
LibOS/shim/src/fs/sys/fs.c, line 101 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
It should be
offset >= max_len
. Otherwise your loop will become infinite.
Here the loop only runs until 0
- num_cpumask-1
. Please clarify.
LibOS/shim/src/fs/sys/fs.c, line 105 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You re-define
ret
here. This will lead to undefined behavior because the outer-scoperet
(which is returned in theout
label) will not be initialized...Please remove the
int
word basically.
Done.
LibOS/shim/src/fs/sys/node_info.c, line 27 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
This looks ugly... Why not just:
if (strcmp(name, "online") != 0) { log_debug("unrecognized file: %s", name); return -ENOENT; } ret = sys_convert_ranges_to_str(&g_pal_control->topo_info.nodes, str, sizeof(str), ","); if (ret < 0) return ret; return sys_load(str, out_data, out_size);
OK, done.
LibOS/shim/src/fs/sys/node_info.c, line 54 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Where is
ret = -ENOENT
in this case?
ah missed it, done.
Pal/src/host/Linux-SGX/db_main.c, line 178 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Not done?
Done now :)
Pal/src/host/Linux-SGX/db_main.c, line 256 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Almost, forgot to add a space (
=1
->= 1
)
Done.
Pal/src/host/Linux-SGX/db_main.c, line 412 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
That's a typical way of writing such code snippets. It conveys that "If I got to this point in the function, I consider it a success".
Your current way is syntactically equivalent but conveys a more contrived logic: "I first assume that the function will end successfully, and if something fails, I will immediately bail out; so if I got to this point in the function, I know that my initial assumption indeed was correct".
In other words, it's just that it's easier to read.
OK, done.
Pal/src/host/Linux-SGX/db_main.c, line 414 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Not completely done.
i = 0 ;
->i = 0;
Done.
Pal/src/host/Linux-SGX/db_main.c, line 566 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You can remove this comment here now.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 157 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Shouldn't you move this comment one more function above (above
copy_hw_resource_range
)?Also, you have a double-space.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 193 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I would merge these two IFs into one, will look nicer.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 339 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
PAL_RES_RANGE_INFO *
->PAL_RES_RANGE_INFO*
Done.
Pal/src/host/Linux-SGX/db_main.c, line 340 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You have a problem with spaces :) Please remove one here
Done :)
Pal/src/host/Linux-SGX/db_main.c, line 435 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Use
BITS_IN_TYPE
Done.
Pal/src/host/Linux-SGX/db_main.c, line 440 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Use
BITS_IN_TYPE
Done.
Pal/src/host/Linux-SGX/db_main.c, line 445 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Use
BITS_IN_TYPE
Done.
Pal/src/host/Linux-SGX/db_misc.c, line 804 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Wait, but this function is different, right? It is the PAL (internal) interface, so it must return PAL error codes, not just
-1
. See for example_DkGetCPUInfo()
above.I am fine with returning
-1
in static internal funcs for sanitization, but definitely not in_DkXXX
functions.
ah I missed this. But I agree for PAL interface we should return PAL error instead of -1
Pal/src/host/Linux-SGX/db_misc.c, line 811 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
ditto
Done.
Pal/src/host/Linux-SGX/db_misc.c, line 811 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
This comment is in another file. Can you add the comment to this file as well?
Added this /* This function doesn't clean up resources on failure, as we terminate on failure. */
comment here.
Pal/src/host/Linux-SGX/db_misc.c, line 817 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
ditto
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r5, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @vijaydhanraj)
LibOS/shim/src/fs/sys/fs.c, line 46 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Sorry, how do we run into an infinite loop? The loop only runs
range_cnt
times which is unmodified.Also, what is the use of
offset == max_len
condition? It is going to be a no-op right?
Oops, my bad, I misread the loop condition.
LibOS/shim/src/fs/sys/fs.c, line 101 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Here the loop only runs until
0
-num_cpumask-1
. Please clarify.
Ignore me.
LibOS/shim/src/fs/sys/fs.c, line 101 at r5 (raw file):
/* Convert cpumask to strings */ size_t offset = 0; for (int64_t j = num_cpumask - 1; j >= 0; j-- ) {
j-- )
-> j--)
LibOS/shim/src/fs/sys/fs.c, line 110 at r5 (raw file):
(bitmap[j] & 0xf00000) >> 20, (bitmap[j] & 0xf0000) >> 16, (bitmap[j] & 0xf000) >> 12, (bitmap[j] & 0xf00) >> 8, (bitmap[j] & 0xf0) >> 4, (bitmap[j] & 0xf), (j == 0) ? "\0" : ",");
Actually, why so weird? Why do you need to print hex-digit by hex-digit?
%x
qualifier takes unsigned int
which is exactly the type of your bitmap[j]
(32 bits). So you could simply do:
snprintf(str + offset, max_size - offset, "%08x%s", bitmap[j], (j == 0) ? "\0" : ",");
Or did I miss something? My version seems correct.
Pal/src/host/Linux-common/topo_info.c, line 109 at r5 (raw file):
res_info->range_count++; /* Realloc as we identify new range when parsing */
I'm confused again. Why can't you use realloc_size()
here?
Are we running in circles? I remember asking you to remove realloc_size()
and just replace it with this explicit logic everywhere instead. But then I guess you found other places where you wanted to use realloc_size()
so you re-introduced it again in those places. But you didn't re-introduce realloc_size()
here?
I mean, if we're using realloc_size()
, let's use it everywhere we can. If we don't use realloc_size()
, then we should remove it from everywhere.
Pal/src/host/Linux-SGX/db_main.c, line 192 at r5 (raw file):
return -1; int64_t start = (int64_t)res_info.ranges[i].start;
I think I already asked but... what's the deal with int64_t
vs uint64_t
? Why can't all variables be of type uint64_t
? Then you won't have these ugly explicit casts and you won't need to do all these checks blabla > INT64_MAX
.
Pal/src/host/Linux-SGX/db_main.c, line 229 at r5 (raw file):
max_limit = 2; /* Taking HT into account */ } else { /* if unified cache then it can range upto total number of cores. */
upto
-> up to
Or is it correct English? Dictionary tells me upto
is not a word.
Pal/src/host/Linux-SGX/db_main.c, line 332 at r5 (raw file):
} static int sanitize_socket_info(PAL_TOPO_INFO topo_info) {
Just noticed. Why is topo_info
passed by value? Or am I missing something?
Pal/src/host/Linux-SGX/db_main.c, line 439 at r5 (raw file):
goto out_numa; } if (bitmap[index] >> (j % BITS_IN_TYPE(int)) == 1) {
Wait, is this check correct? You just want to check if bit j % BITS_IN_TYPE(int)
is set in bitmap[index]
. But your code checks something else: whether from the upper bits only bit j % BITS_IN_TYPE(int)
is set. So your check is wrong.
Anyway, the classic way to check if the bit is set: bitmap[index] & 1U << (j % BITS_IN_TYPE(int))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 18 of 20 files reviewed, 7 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)
LibOS/shim/src/fs/sys/fs.c, line 101 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
j-- )
->j--)
Done.
LibOS/shim/src/fs/sys/fs.c, line 110 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Actually, why so weird? Why do you need to print hex-digit by hex-digit?
%x
qualifier takesunsigned int
which is exactly the type of yourbitmap[j]
(32 bits). So you could simply do:snprintf(str + offset, max_size - offset, "%08x%s", bitmap[j], (j == 0) ? "\0" : ",");
Or did I miss something? My version seems correct.
I am not able to think why I had this logic. So, let us use your suggested approach.
Pal/src/host/Linux-common/topo_info.c, line 109 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I'm confused again. Why can't you use
realloc_size()
here?Are we running in circles? I remember asking you to remove
realloc_size()
and just replace it with this explicit logic everywhere instead. But then I guess you found other places where you wanted to userealloc_size()
so you re-introduced it again in those places. But you didn't re-introducerealloc_size()
here?I mean, if we're using
realloc_size()
, let's use it everywhere we can. If we don't userealloc_size()
, then we should remove it from everywhere.
I did try this, and it was giving me a linker error. Based on our offline chat it seemed like we needed to modify sources and makefile which was not desired. So ended up using explicit logic instead of realloc_size()
.
Pal/src/host/Linux-SGX/db_main.c, line 192 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I think I already asked but... what's the deal with
int64_t
vsuint64_t
? Why can't all variables be of typeuint64_t
? Then you won't have these ugly explicit casts and you won't need to do all these checksblabla > INT64_MAX
.
If you see here, https://github.com/gramineproject/graphene/pull/2661/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R207 we compare with end
with previous_end
which is initialized to -1
.
Again, in couple of places we check against < 0
like here, https://github.com/gramineproject/graphene/pull/2661/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R234
So in order to maintain consistency, I used int64_t
everywhere. I think we had a similar situation last time when merging sysfs changes and the consensus was to keep things uniform.
Pal/src/host/Linux-SGX/db_main.c, line 229 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
upto
->up to
Or is it correct English? Dictionary tells me
upto
is not a word.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 332 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Just noticed. Why is
topo_info
passed by value? Or am I missing something?
Changed this and few other functions as well to pass by reference where typedef struct
was used.
Pal/src/host/Linux-SGX/db_main.c, line 439 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Wait, is this check correct? You just want to check if bit
j % BITS_IN_TYPE(int)
is set inbitmap[index]
. But your code checks something else: whether from the upper bits only bitj % BITS_IN_TYPE(int)
is set. So your check is wrong.Anyway, the classic way to check if the bit is set:
bitmap[index] & 1U << (j % BITS_IN_TYPE(int))
Good catch. Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 11 files at r3, 1 of 5 files at r5, all commit messages.
Reviewable status: 18 of 20 files reviewed, 17 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @vijaydhanraj)
-- commits, line 3 at r6:
missing verb
-- commits, line 14 at r6:
Please align this to the "R" above (ditto below)
LibOS/shim/include/shim_fs_pseudo.h, line 214 at r6 (raw file):
This isn't just a traditional "int to string" as the name suggests. Should be named e.g. like this: sys_convert_int_to_sizestr
.
If size multiplier like 'K', 'M' or 'G' is passed
How can a character be passed in uint64_t
argument? If you want to pass a character then it should be char
.
If size multiplier [...] is passed
How can it not be passed? C doesn't have optional arguments...
This whole signature and explanation is very confusing to me (at this point I haven't read the implementation yet, but the header should suffice for the function users).
LibOS/shim/include/shim_fs_pseudo.h, line 216 at r6 (raw file):
ranges[0].start and ranges[0].end
This is pseudocode, right? There's no ranges
parameter in this function.
LibOS/shim/include/shim_fs_pseudo.h, line 217 at r6 (raw file):
then `0-63` string is generated
umm, but it's res_range_info
which has "res" in its name, I'd assume that this is the returned result of the function and the string is its input? But the function name and the comment say otherwise.
LibOS/shim/include/shim_fs_pseudo.h, line 222 at r6 (raw file):
/* Converts array of integer range(s) to a cpu bitmap string. For example, if ranges[0].start and * ranges[0].end were 0 and 31 respectively, then `00000000,ffffffff` string is generated. */
ditto
LibOS/shim/include/shim_fs_pseudo.h, line 225 at r6 (raw file):
int sys_convert_ranges_to_cpu_bitmap_str(const PAL_RES_RANGE_INFO* res_range_info, char* str, size_t max_size);
Overall: these declarations are super confusing, I just read them and have no idea what would actually happen if I called these functions in my code.
LibOS/shim/src/fs/sys/cache_info.c, line 34 at r6 (raw file):
MULTIPLIER_NONE
Ugh, where did you get that constant from? The header for this function doesn't mention it at all, the type of this argument is uint64_t
(indicating that maybe I should pass 1
/1000
/1000000
?), the description says that I should pass 'K'
/'M'
/'G'
or not pass it at all (?), and here you're passing some enum?
LibOS/shim/src/fs/sys/cache_info.c, line 34 at r6 (raw file):
ret = sys_convert_ranges_to_cpu_bitmap_str(&cache->shared_cpu_map, str, sizeof(str)); } else if (strcmp(name, "level") == 0) { ret = sys_convert_int_to_str(cache->level, MULTIPLIER_NONE, str, sizeof(str));
Why does this multiplier even exists? Wasn't the idea that what host passes should be as simple and normalized as possible? Why not just pass the number of bytes everywhere and just in that int->str function shorten it if it's divisible by some size unit? (for specific rounding rules please check the kernel source)
With current version you have a problem that you may pass to the app things which normal kernel would never pass - e.g. "1024K" instead of "1M", or even things like "1000000000K".
And another problem: what if the untrusted host gives you LONG_MAX
gigabytes? I think in some places you'll just pass it to the app without erroring out, because you never needed to multiply it, so there was no chance for overflow detection (which will certainly happen in the user app then).
LibOS/shim/src/fs/sys/fs.c, line 70 at r6 (raw file):
if (g_pal_control->topo_info.possible_logical_cores.resource_count > INT64_MAX) return -EINVAL; int ret;
this is declared in a completely random place
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r6, all commit messages.
Reviewable status: all files reviewed, 13 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow and @vijaydhanraj)
LibOS/shim/src/fs/sys/fs.c, line 70 at r6 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
this is declared in a completely random place
To expand on @mkow's comment: we (almost always) put int ret;
as the very first statement of the function.
Pal/src/host/Linux-common/topo_info.c, line 109 at r5 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
I did try this, and it was giving me a linker error. Based on our offline chat it seemed like we needed to modify sources and makefile which was not desired. So ended up using explicit logic instead of
realloc_size()
.
But in the other place you use realloc_size()
. Now we have a strange state that some PAL sources use realloc_size
and some sources don't use it (and instead do the malloc-memcpy-free
pattern explicitly).
Sorry to go back and forth, but let's remove your newly-introduced realloc_size
function and replace the only caller of realloc_size
with the explicit malloc-memcpy-free
pattern. It doesn't make sense to introduce a function for only one place in the code and (for some obscure reason) to not use this function in other places.
This means that you should drop the commit [Pal] Add
realloc_size() in PAL slab manager
.
Pal/src/host/Linux-SGX/db_main.c, line 192 at r5 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
If you see here, https://github.com/gramineproject/graphene/pull/2661/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R207 we compare with
end
withprevious_end
which is initialized to-1
.Again, in couple of places we check against
< 0
like here, https://github.com/gramineproject/graphene/pull/2661/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R234So in order to maintain consistency, I used
int64_t
everywhere. I think we had a similar situation last time when merging sysfs changes and the consensus was to keep things uniform.
Well, previous_end
could be initialized to 0
, and everything would still work.
I still don't get why we have int64_t
. There must have been a better/more sophisticated reason why we introduced these int64_t
types in the first place. Or maybe there is no reason at all? Then we should just go back to uint64_t
, the code will become much simpler.
We already store all CPU/NUMA data as PAL_NUM
which is uint64_t
. There seems to be no reason to use int64_t
and do all these checks like res_info->range_count > INT64_MAX
...
Pal/src/host/Linux-SGX/db_main.c, line 360 at r6 (raw file):
size_t new_sz = sizeof(PAL_RANGE_INFO) * socket_info[socket].range_count; size_t old_sz = new_sz - sizeof(PAL_RANGE_INFO); socket_info[socket].ranges = realloc_size(socket_info[socket].ranges, old_sz, new_sz);
Please replace this call with an explicit malloc-memcpy-free
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 13 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @vijaydhanraj)
LibOS/shim/src/fs/sys/fs.c, line 70 at r6 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
To expand on @mkow's comment: we (almost always) put
int ret;
as the very first statement of the function.
Not even this, I would be also fine if it was declared right before the first usage. But here it's declared in a middle of unrelated code...
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
1. Refactor sysfs code to convert CPU/NUMA information from Linux-formatted strings to integers in untrusted PAL. 2. Sanitize CPU/NUMA information based on untrusted integers. 3. Convert the trusted CPU/NUMA information from integers back to Linux-formatted strings in LibOS. 4. Consolidate all topology related information from `PAL_CPU_INFO` struct into `PAL_TOPO_INFO` struct. Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
This commit adds `number_of_sets` path to sysfs cache info that was missed while refactoring sysfs. Also, updated the test script to validate this path. Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
48aa4d7
to
44c6bee
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 12 of 20 files reviewed, 13 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv and @mkow)
Previously, mkow (Michał Kowalczyk) wrote…
missing verb
Rephrased as Add helper macro to convert bits to integer size
Previously, mkow (Michał Kowalczyk) wrote…
Please align this to the "R" above (ditto below)
Done.
LibOS/shim/include/shim_fs_pseudo.h, line 214 at r6 (raw file):
Renamed from sys_convert_int_to_str
to sys_convert_int_to_sizestr
How can a character be passed in uint64_t argument?
I am not passing a character but an enum
value here.
How can it not be passed?
My wording might be misleading. What I meant here by "not passing" is that we simply pass a default multiplier like MULTIPLIER_NONE
instead of the actual size qualifier
I have reworded the text as below. Let me know if this makes it clearer.
/* Converts integer to a string. When integer representation (enum value) of size multiplier like
*'K', 'M' or 'G' is passed, it is converted back to string and appended. When the size multiplier
* is `MULTIPLIER_NONE`, nothing is appended, and the integer is simply converted to a string. */
LibOS/shim/include/shim_fs_pseudo.h, line 216 at r6 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
ranges[0].start and ranges[0].end
This is pseudocode, right? There's no
ranges
parameter in this function.
I have added the parameter that has ranges
in the comment to make it clear.
LibOS/shim/include/shim_fs_pseudo.h, line 217 at r6 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
then `0-63` string is generated
umm, but it's
res_range_info
which has "res" in its name, I'd assume that this is the returned result of the function and the string is its input? But the function name and the comment say otherwise.
As the function name indicates we want to convert the integer input (in this case range info) that is part of res_range_info
to an output string that is stored in str
.
res
in this context is for resource
and not result
LibOS/shim/include/shim_fs_pseudo.h, line 222 at r6 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
ditto
Please refer to the earlier comment.
LibOS/shim/include/shim_fs_pseudo.h, line 225 at r6 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Overall: these declarations are super confusing, I just read them and have no idea what would actually happen if I called these functions in my code.
I have reworded the comments a bit to clarify. Please see it helps. Suggestions are definitely welcome.
LibOS/shim/src/fs/sys/cache_info.c, line 34 at r6 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
MULTIPLIER_NONE
Ugh, where did you get that constant from? The header for this function doesn't mention it at all, the type of this argument is
uint64_t
(indicating that maybe I should pass1
/1000
/1000000
?), the description says that I should pass'K'
/'M'
/'G'
or not pass it at all (?), and here you're passing some enum?
I have this enum
defined in pal-arch.h
. pal.h
includes this and it is in turn included in shim_fs.h
I have reworded the comments on this function, please take a look and let me know.
LibOS/shim/src/fs/sys/cache_info.c, line 34 at r6 (raw file):
Why does this multiplier even exists?
We have a multiplier because we exactly need to know how the normal kernel is representing the strings. For example, on reading/sys/devices/system/cpu/cpu2/cache/index3/size we get
49152K. Now this can be represented as
48Mor
49152K` and in this case, kernel chose the latter representation, and we need to give the same to the end-user.
With current version you have a problem that you may pass to the app things which normal kernel would never pass - e.g. "1024K" instead of "1M", or even things like "1000000000K".
Please see the earlier comment.
And another problem: what if the untrusted host gives you LONG_MAX gigabytes?
Please see here (lines 263-267
) https://github.com/gramineproject/graphene/pull/2661/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R263, I do multiply the size qualifier and check the limits when sanitizing so we shouldn't be running into overflow issues in this case. If you see the issue somewhere, please point me to it.
LibOS/shim/src/fs/sys/fs.c, line 70 at r6 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Not even this, I would be also fine if it was declared right before the first usage. But here it's declared in a middle of unrelated code...
I might have removed some code and missed removing/re-arranging the ret
variable. But fixed this now.
Pal/src/host/Linux-common/topo_info.c, line 109 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
But in the other place you use
realloc_size()
. Now we have a strange state that some PAL sources userealloc_size
and some sources don't use it (and instead do themalloc-memcpy-free
pattern explicitly).Sorry to go back and forth, but let's remove your newly-introduced
realloc_size
function and replace the only caller ofrealloc_size
with the explicitmalloc-memcpy-free
pattern. It doesn't make sense to introduce a function for only one place in the code and (for some obscure reason) to not use this function in other places.This means that you should drop the commit
[Pal] Add
realloc_size()in PAL slab manager
.
I have dropped the commit [Pal] Add realloc_size() in PAL slab manager
and have followed malloc-memcpy-free
pattern.
Pal/src/host/Linux-SGX/db_main.c, line 192 at r5 (raw file):
Well, previous_end could be initialized to 0, and everything would still work.
Let us say we have a range like 0, 3-4, 31
. Here start = end = 0
. Now, if we have previous_end
initialized to 0
it will fail our check previous_end >= end
.
Even I am not sure why we had int64_t
earlier but agree that we can avoid int64_t
and the corresponding checks. I have worked around this and have converted int64_t
to uint64_t
.
Pal/src/host/Linux-SGX/db_main.c, line 360 at r6 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please replace this call with an explicit
malloc-memcpy-free
.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed the review comments and pushed the changes. Also, as discussed ported this to gramine repository, gramineproject/gramine#106. Please let me know if I should close this one.
Reviewable status: 12 of 20 files reviewed, 13 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv and @mkow)
Description of the changes
This PR refactors sysfs to improve sanitization. It fixes issues 1, 2 listed as part of #2105.
How to test this PR?
Please run
sysfs_common
andproc_common
regression tests.This change is![Reviewable](https://camo.githubusercontent.com/1541c4039185914e83657d3683ec25920c672c6c5c7ab4240ee7bff601adec0b/68747470733a2f2f72657669657761626c652e696f2f7265766965775f627574746f6e2e737667)