Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change default option and format #43

Merged
merged 10 commits into from
Mar 16, 2023
Merged

change default option and format #43

merged 10 commits into from
Mar 16, 2023

Conversation

mawc2019
Copy link
Collaborator

@mawc2019 mawc2019 commented Feb 15, 2023

In this PR, according to @stevengj 's suggestion, the default option in ruler.py is changed back to open minus close. This default approach generates the smaller minimum length scale between solid and void regions. The number could be slightly different from calculating minimum length scales for solid and void regions separately and then taking the minimum.

According to @ianwilliamson 's suggestion, the output is always a two-element tuple. When a user requires only one number, the second element is None. The code in ruler.py has been formatted by yapf.

@oskooi suggested to show various designs in README.md. It was done in a previous PR. Do you think those were too many? Shall we move all of them to example.ipynb and provide a link in README.md?

The python test file will be uploaded later. After everything is done, shall we move it to the NanoComp site? In addition, the name ruler is quite arbitrary. Shall we rename it by extracting some letters from a descriptive phrase, e.g., miles from minimum length scale?

ruler/ruler.py Outdated
arr: Iterable[np.ndarray],
phys_size: Iterable[np.ndarray],
margin_size: Iterable[np.ndarray] = np.zeros((2, 2)),
measure_region: str = 'min',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest making this an enum instead of a string, which will alleviate validation and the need to enumerate all the variations above. Similar suggestion for the pad_mode parameter

ruler/ruler.py Outdated

def minimum_length(
arr: Iterable[np.ndarray],
phys_size: Iterable[np.ndarray],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense for these to be np.ndarrays (which, in general, can have any number of dimensions / shape)? If I understand these parameters correctly, phys_size specifies the extent in physical units of the design arr along each of its axes? If so, could we just type this as Tuple[float, ...] or List[float] and validate that len(phys_size) == arr.ndim? IMO this would be more straightforward.

Similar comment for margin_size. If I understand correctly, it is meant to be interpreted similarly to a padding, but is instead a sort of de-padding. I would suggest a type similar to the pad_width in numpy's pad operation like List[Tuple[float, float]] where the quantities are [(before_1, after_1), ... (before_N, after_N)] for each axis.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense for these to be np.ndarrays (which, in general, can have any number of dimensions / shape)?

I think it makes sense for the input design pattern arr to be np.ndarray, but phys_size does not need to be np.ndarray. The code can accept phys_size as an array, a list, a tuple, or simply a number (for 1d design pattern).

If I understand these parameters correctly, phys_size specifies the extent in physical units of the design arr along each of its axes?

Yes, this is correct.

If so, could we just type this as Tuple[float, ...] or List[float] and validate that len(phys_size) == arr.ndim?

The argument phys_size can have different types as mentioned above. If we have to choose one of the types for type hints, why is Tuple or List preferred over Iterable[np.ndarray] (apart from appearing more straightforward)?

Similar comment for margin_size. If I understand correctly, it is meant to be interpreted similarly to a padding, but is instead a sort of de-padding.

Yes, this is correct.

I would suggest a type similar to the pad_width in numpy's pad operation like List[Tuple[float, float]] where the quantities are [(before_1, after_1), ... (before_N, after_N)] for each axis.

This will be adopted in the next version.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code can accept phys_size as an array, a list, a tuple, or simply a number (for 1d design pattern).

Supporting so many possibilities places a much higher burden on validating for correctness of the inputs. In particular, np.ndarray is a completely unconstrained type and seems quite awkward (IMO) for specifying the extent and border "throw away" region.

Constraining the API for phys_size and margin_size to be either a tuple or a list whose length matches arr.ndim seems reasonable and has precident in various numpy APIs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that we should keep things as simple as possible. To that end, it may be worthwhile removing the phys_size here and simply compute the minimum feature size in units of pixels. A separate API with a physical size could then be created by wrapping this simpler function.

Copy link
Collaborator Author

@mawc2019 mawc2019 Feb 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To that end, it may be worthwhile removing the phys_size here and simply compute the minimum feature size in units of pixels. A separate API with a physical size could then be created by wrapping this simpler function.

The resolution along different dimensions may not always be the same. For example, in a 2d design pattern, the resolution along the x direction may not be the same as the resolution along the y axis, although we have not encountered this situation in this project. Such anisotropy makes it infeasible to decouple the two steps. We can only decouple them safely for 1d design patterns, and in ruler.py it was done in this way.

If so, could we just type this as Tuple[float, ...] or List[float] and validate that len(phys_size) == arr.ndim?

Sometimes phys_size and arr provided by users may have redundant dimensions, and phys_size may contain zeros. In _ruler_initialize, zeros in phys_size are removed and arr is squeezed, upon which the dimension check is performed. To remove zeros from phys_size, my approach is to convert it to an array and then apply .nonzero() to obtain the indices of nonzeros elements.

ruler/ruler.py Outdated
def minimum_length(arr,phys_size,margin_size=np.zeros((2,2)),measure_region='solid',pad_mode=None):
Compute the minimum length scale in a design pattern.

arr: input array that represents a design pattern.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These first three parameters are typed as Iterable and it's not clear from these docstrings how that is supposed to be interpreted. Is one supposed to provide a list of designs and the function will return a list of measured feature sizes (per provided design)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is one supposed to provide a list of designs and the function will return a list of measured feature sizes (per provided design)?

No, the user is supposed to provide one design every time instead of a list of designs. As mentioned in the annotation, arr: input array that represents a design pattern. Do you think it is still not clear?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then Iterable[np.ndarray] is not the correct type annotation. It should be np.ndarray.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Thanks!

@mawc2019
Copy link
Collaborator Author

mawc2019 commented Feb 18, 2023

Here is a little complicated situation, which is taken care of in this PR. But I am not sure whether we should deal with this situation.

As shown in the example below, if a design pattern contains a structure near boundaries and the user does not specify any pad mode (pad_mode=None), the minimum length scale can be either the size of the gap (marked by the red bar) or the diameter of the disc.
image

This PR introduces an argument with_boundary, which is only meaningful when pad_mode=None. If True, the gap formed with the boundary is considered, and the minimum length scale is the size of the gap in this example; if False, any gap formed with boundaries is never regarded as a minimum length scale. The default option is False.

Relevant issues were discussed here and here.

@mfschubert
Copy link
Contributor

mfschubert commented Feb 18, 2023

In this design, it seems natural that the minimum length scale is the diameter of the circle. This effectively assumes that there is more void to the right of the image, which seems better justified than the alternative, i.e. assuming that there is solid to the right of the image. To me, it seems like edge padding is generally the most reasonable choice.

Regarding the with_boundary and pad_mode options, I would actually be in favor of removing these, and just always report the length scale for an edge-padded image. A user can always provided an alternately padded or trimmed design should they desire, but in my opinion this would generally need to be well justified and it's fine to make the user do this manually.

(edit) Actually, upon reflection I think I see the utility of trimming the border -- for example, when a feature cuts diagonally across the design region. In this case, edge padding could create sharp 45 degree corners which have small minimum length scales. Let me try to run some tests with my version of the ruler.

@mawc2019
Copy link
Collaborator Author

mawc2019 commented Feb 18, 2023

To me, it seems like edge padding is generally the most reasonable choice.

In ruler.py, the argument pad_mode has four options, i.e., 'solid', 'void', 'edge', and None. I agree that the edge mode is the most reasonable choice.

Regarding the with_boundary and pad_mode options, I would actually be in favor of removing these, and just always report the length scale for an edge-padded image. A user can always provided an alternately padded or trimmed design should they desire, but in my opinion this would generally need to be well justified and it's fine to make the user do this manually.

Yes, users can always do this manually. The padding option was added to ruler.py very recently. Can we finalize this option, @stevengj ?

Actually, upon reflection I think I see the utility of trimming the border -- for example, when a feature cuts diagonally across the design region. In this case, edge padding could create sharp 45 degree corners which have small minimum length scales.

Yes. Two examples are here.

@mfschubert
Copy link
Contributor

Yes. Two examples are here.

OK, I just ran some tests with both the ruler and metrics implementations, and both actually seem to work fine on the diagonal line problem (see this colab). Is there a known-problematic example we could test?

@mawc2019
Copy link
Collaborator Author

both actually seem to work fine on the diagonal line problem

In a diagonal line problem, if the border is not trimmed, the sharp corners should have a length scale, which is much larger than the size of one pixel. In your example, I guess the length scale of sharp corners is larger than the width of the strip. Consequently, those sharp corners do not affect the results.

Is there a known-problematic example we could test?

Here is a similar example that causes incorrect estimation. The width of the strip is 200, which is also the declared minimum length scale if the void gaps are ignored. However, the minimum length scales estimated by metrics.py and ruler.py are 40 and 75, respectively.
image

The code is as follows. (I got errors when I tried to rerun the notebook in your link.)

i, j = np.meshgrid(np.arange(240), np.arange(240))
angle = np.radians(5)
off_set = 60
design = ((i*np.sin(angle) + j*np.cos(angle) - 300+off_set) < 0) & ((i*np.sin(angle) + j*np.cos(angle) - 100+off_set) > 0)

mls_metrics = metrics.minimum_length_scale(design)
mls_ruler = ruler.minimum_length(
    design.astype(float),
    phys_size=design.shape,
    margin_size=((0, 0), (0, 0)),
    pad_mode="edge",
)

plt.figure(figsize=(10, 10))
plt.title(f"mls_metrics={mls_metrics}\nmls_ruler={mls_ruler}")
plt.imshow(design)

@mfschubert
Copy link
Contributor

mfschubert commented Feb 18, 2023

Here is a similar example that causes incorrect estimation. The width of the strip is 200, which is also the declared minimum length scale if the void gaps are ignored. However, the minimum length scales estimated by metrics.py and ruler.py are 40 and 75, respectively.

Ah yes, thanks for this example. It does motivate some flexibility to ignore the borders of the design, although I would still question whether we need the flexibility of arbitrary trimming along each edge. In any case, this motivates some changes to my own ruler implementation, so let me work on that and I might have a proposal thereafter.

@mawc2019
Copy link
Collaborator Author

To me, it seems like edge padding is generally the most reasonable choice.

In ruler.py, the argument pad_mode has four options, i.e., 'solid', 'void', 'edge', and None. I agree that the edge mode is the most reasonable choice.

If we use the edge mode, what would be the minimum length scale of the void region in this pattern? When the edge mode is used, the result given by ruler.py is the short side of the design pattern, which may not make sense to everyboby.



class TestRuler(unittest.TestCase):
def setUp(self):
Copy link
Collaborator

@ianwilliamson ianwilliamson Feb 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no logic or parameterization being done in this setUp() routine. Therefore, it is better to set these constants at the module level, i.e.

RESOLUTION = 1
PHYS_SIZE = (200, 200)
# ...

above the class definition.

ruler/ruler.py Outdated
def void_minimum_length(
arr: np.ndarray,
phys_size: Tuple[float, ...],
margin_size: List[Tuple[float, float]] = ((0, 0), (0, 0))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you would actually want [(0, 0), (0, 0)], since your typing indicates a list of tuples. OTOH this is probably not a reasonable default value since it is fixed to two dimensions and arr could be 1, 2, or 3 dimensions. If you want the default behavior to be 0 margin on top and bottom of all dims of arr, I suggest typing this as Optional[ ... ], setting the default to None, and handling the creation of an appropriate zero-valued margin_size for whatever dimension of arr is passed inside the function.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you would actually want [(0, 0), (0, 0)], since your typing indicates a list of tuples.

I think something like ((0,0),(0,0)) may sound better because it would follow the same syntax as pad_width in numpy.pad, as you suggested before.

I suggest typing this as Optional[ ... ], setting the default to None

In the next version, this line will be margin_size: Optional[Tuple[Tuple[float, float],...]] = None.

@ianwilliamson
Copy link
Collaborator

One other high-level comment is that we should document what each function returns in its docstring.

I won't require that we adhere to a particular formatting style, but my bias would be to suggest the Google style, which is consistent with yapf formatting. Even if we don't strictly adopt the style, please see its section on function docstring formatting for guidance on describing what gets returned: https://google.github.io/styleguide/pyguide.html#383-functions-and-methods

ruler/ruler.py Outdated
phys_size: physical size of the design pattern.
margin_size: physical size of borders that need to be discarded.
'''
if isinstance(phys_size, np.ndarray):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these checks still needed? You're constraining phys_size to be a tuple of floats at the top level now, correct?

Copy link
Collaborator Author

@mawc2019 mawc2019 Feb 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although the type hint indicates phys_size to be a tuple, I think some checks here are still needed if we want some flexibility in user input. I personally prefer to keeping such flexibility for users. One reason is that a user can easily input a phys_size that has a reasonable type but is slightly different from our expected one. Another reason is that type hints do not report errors. In addition, we need to peel away redundant dimensions and remove zeros in phys_size, which can be done more easily when phys_size is converted to an array.
That being said, the code here can be simplified, because the if and the first elif parts can be merged as one if statement.

@mawc2019
Copy link
Collaborator Author

In any case, this motivates some changes to my own ruler implementation, so let me work on that and I might have a proposal thereafter.

Can you explain the approach based on padding that you mentioned today, @mfschubert? I think trimming helps but padding does not. In recent versions, I remove padding options for users.

@mfschubert
Copy link
Contributor

mfschubert commented Feb 25, 2023

Yes, let me try. I believe there is a problematic case, which is very similar to the one you shared -- where a feature runs into the edge of the design region, with a very shallow angle. I created such a design x, the top left panel of the figure below.

In all cases I am using a diameter-40 circular brush.

Currently, the minimum feature size for solid is computed by comparing the binary opening of edge-padded x with x itself. I plotted the result of the opening, but kept the padded region in the image just for illustration purposes. One would actually only compare the central region denoted by the red line. One can see that there are no violations -- this is the bottom center image.

Turning now to the minimum feature size for void: this is computed by comparing the binary opening of ~x with ~x. The two figures on the right show the result. With the diameter-40 brush, it is of course not possible to fill in the corner of the void feature -- and increasingly so as the brush size increases. At some brush size, even interior pixels in the design may show up as violations with the brush.

image

My current belief is that we will need check for violations twice -- once with entirely solid padding, and once with entirely void padding. Then, if any pixel has no violations in even one of the two cases, then we would consider this as no violations.

@mawc2019
Copy link
Collaborator Author

mawc2019 commented Feb 27, 2023

Thanks for clarifying, @mfschubert!

My current belief is that we will need check for violations twice -- once with entirely solid padding, and once with entirely void padding. Then, if any pixel has no violations in even one of the two cases, then we would consider this as no violations.

I agree that this double-checking scheme can probably work well in many cases. The logic seems that, entirely solid padding does not affect the result of solid minimum length scale, if the design pattern has a well defined solid minimum length scale somewhere. Likewise, entirely void padding does not affect the result of void minimum length scale, if the design pattern has a well defined void minimum length scale somewhere.
Sometimes the minimum length scale may not be well defined. For example, in the case mentioned above, if the double-checking scheme is used, the calculated minimum length scale may be close to the length of the entire design pattern, which implies that the solid instead of the void region is responsible for the minimum length scale. However, we know that solid and void have the equal status, which means no one is more special than the other.

Overall, apart from the edge-mode padding, it seems a good idea to do entirely solid padding when we estimate solid minimum length scales, and do entirely void padding when we estimate void minimum length scales. However, when we want to estimate minimum length scales through open minus close, which is done by the function dual_minimum_length() in ruler.py, the edge-mode padding seems the only fair choice. As a consequence, in some cases, the result from dual_minimum_length() may be quite different from the minimum between the results of solid_minimum_length() and void_minimum_length().

declared_mls: float,
angle: float = 0,
center: Tuple[float, float] = (0, 0)) -> np.ndarray:
'''
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Complete the docstring for this and the disc() function below?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!

@ianwilliamson
Copy link
Collaborator

@mawc2019 Nice work on the formatting / documentation! I think the code is much easier to follow now.

As minor general stylistic suggestion, I suggest using # ... style instead of """ ... """ for inline (non-docstring) comments.

@mawc2019
Copy link
Collaborator Author

As minor general stylistic suggestion, I suggest using # ... style instead of """ ... """ for inline (non-docstring) comments.

A non-docstring annotation may occupy several lines. Perhaps the # ... style is not the most suitable for this case? Shall we use something like ''' ... ''' instead of """ ... """ or # ...?

@ianwilliamson
Copy link
Collaborator

A non-docstring annotation may occupy several lines. Perhaps the # ... style is not the most suitable for this case?

It's a recommended approach: https://google.github.io/styleguide/pyguide.html#385-block-and-inline-comments

@mawc2019 mawc2019 force-pushed the change_default_and_format branch from 101f97b to 0df9cf8 Compare March 7, 2023 07:54
@stevengj
Copy link
Contributor

stevengj commented Mar 7, 2023

Conclusion: let's go with min(open-design, design-close), use opencv, use the new definition of interfacial pixels, and use the simplest-to-explain definition of the discretized "circular" kernel.

@oskooi
Copy link
Collaborator

oskooi commented Mar 10, 2023

One other feature relates to the padding of the edge pixels of the design region. Our convention will be:

  1. when measuring the minimum length of the solid regions, the edge padding will use solid pixels (as part of the open operation).
  2. when measuring the minimum length of the void regions, the edge padding will use void pixels (as part of the close operation).

@mawc2019
Copy link
Collaborator Author

mawc2019 commented Mar 12, 2023

A few more issues need to be finalized, but they do not influence the results of minimum length scales.

  1. Shall we remove the argument margin_size for 2d designs? It seems no longer useful for 2d designs. The initial purpose of this option was to allow users to disregard some regions near boundaries, so that the estimation of minimum length scales is unaffected by sharp corners formed with boundaries or extra structures due to edge-mode padding. However, the current default padding scheme satisfies the initial purpose of margin_size. As mentioned before, the current default padding scheme is 1) solid/void padding when estimating minimum length scales of solid/void regions and 2) solid/void padding for opening/closing in the opening-minus-closing approach. Although we also allow users to specify other padding schemes with which margin_size could make a difference, using such a non-default padding scheme and nonzero margin_size simultaneously seems self-contradictory.
  2. The argument margin_size is still useful for 1d designs. In this case, the code simply seeks the shortest segments composed of consecutive solid or void pixels in the input 1d array. No morphological functions are involved. The argument margin_size specifies the sizes of head and tail regions to be disregarded.
  3. Shall we add a user function dedicated to 1d designs? One reason is that the arguments margin_size and pad_mode are useful for 2d but useless for 1d the argument margin_size is useful for 1d but useless for 2d, while the argument pad_mode is useful for 2d but useless for 1d. Another reason is that, unlike 2d cases where solid and void minimum length scales can be computed separately, in 1d cases, the two minimum length scales are always obtained simultaneously after the 1d array is scanned once.
  4. There are four user functions in the current version, namely, solid_minimum_length, void_minimum_length, both_minimum_length, and dual_minimum_length. The name dual_minimum_length does not seem straightforward. Perhaps we should rename it as minimum_length.

@mawc2019 mawc2019 force-pushed the change_default_and_format branch from 0df9cf8 to 9a52603 Compare March 13, 2023 00:45
@mfschubert
Copy link
Contributor

To be clear, is the 1D codepath different than the 2D codepath? I would expect the 2D algorithm works for 1D designs, and if we do have a dedicated 1D codepath I would actually recommend eliminating it entirely. We don't need 1D length scales for the paper, and if I'm not mistaken the 2D algorithm will work for 1D designs (as long as you add a dummy dimension).

With this change, we can eliminate pad_mode and margin_size entirely. If there is a reason to keep the 1D codepath, I would change the behavior of the 1D algorithm to match that of 2D, i.e. to not require the pad_mode and margin_size parameters.

Regarding your question (5), best to name things descriptively so the user has an idea about what they are working with. For example:

  • minimum_lengths which computes both solid and void
  • minimum_length_solid
  • minimum_length_void
  • minimum_length_solid_or_void which computes the minimum of the two

@mawc2019
Copy link
Collaborator Author

To be clear, is the 1D codepath different than the 2D codepath?

Yes. Both 1d and 2d share the same user functions, but inside the user functions, 1d problems are solved in a different approach from 2d. No morphological operations are involved in the 1d approach.

if I'm not mistaken the 2D algorithm will work for 1D designs (as long as you add a dummy dimension).
If there is a reason to keep the 1D codepath, I would change the behavior of the 1D algorithm to match that of 2D

There are two reasons for using an dedicated 1d algorithm instead of making it 2d by adding a dummy dimension.

  1. Minimum length scales of any 1d pattern can be computed accurately if we use the algorithm dedicated to 1d. If we add a dummy dimension and make it 2d, we might get results with small errors. (As we know, the 2d algorithm are not completely accurate in many cases). Replacing a completely accurate algorithm with a less accurate one does not sound like a perfect choice.
  2. When adding a dummy dimension, the size of the dummy dimension needs to be large enough, or we may get incorrect results. The size of the dummy dimension typically should not less than the minimum length scale that we want to seek. (This is not a problem when we deal with small-scale 1d problems because we can simply make the dummy dimension large enough.)

and if we do have a dedicated 1D codepath I would actually recommend eliminating it entirely. We don't need 1D length scales for the paper

It does not affect any results in the paper and we do not need to mention it in the paper. But for the completeness of a program for computing minimum length scales, I suggest we keep it. By the way, we were also reluctant to eliminate the 3d code (as mentioned in the first point in @stevengj 's comment here). We finally eliminate it not only because the paper does not need it, but also because we have to do so in order to use OpenCV. So the 1d code does not enforce us to remove it as the 3d code did.

With this change, we can eliminate pad_mode and margin_size entirely.

Sorry, there was a mistake in my previous reply. It was corrected. It should be "the argument margin_size is useful for 1d but useless for 2d, while the argument pad_mode is useful for 2d but useless for 1d."

Regarding your question (5), best to name things descriptively so the user has an idea about what they are working with. For example:
minimum_lengths which computes both solid and void
minimum_length_solid
minimum_length_void
minimum_length_solid_or_void which computes the minimum of the two

Thanks for the suggestion. The first three sounds good to me. I will change the names correspondingly. But for the last one, I still prefer minimum_length to minimum_length_solid_or_void.

@mfschubert
Copy link
Contributor

I would be surprised if the 2D algorithm were inaccurate for 1D designs. In my own ruler I used 1D designs in some test cases. Are you sure this is the case? And, regarding the dummy dimension, all I meant was

x_with_dummy_dim = x[:, np.newaxis]

I don't think we actually need the new dimension to have >1 dimension for the code to work.

Finally, regarding naming convention, it's your call but I wouldn't have two functions with the names minimum_lengths and minimum_length. These are easy to get confused.

@mawc2019
Copy link
Collaborator Author

mawc2019 commented Mar 14, 2023

And, regarding the dummy dimension, all I meant was x_with_dummy_dim = x[:, np.newaxis]

An example is as follows.
image
If we do not disregard any regions, the declared minimum length scales of solid and void regions are 4 and 3. If the outermost regions are disregarded, the declared minimum length scales of solid and void regions are 10 and 21. But when I used the syntax x_with_dummy_dim = x[:, np.newaxis] and then applied metrics, the results were 60 and 101. Perhaps I did not use metrics correctly?

import numpy as np
from matplotlib import pyplot as plt
import metrics

pattern_1d = np.sin(np.linspace(0,100,101)/5) > 0.5
plt.plot(pattern_1d)

pattern_2d = pattern_1d[:,np.newaxis]
print(metrics.minimum_length_scale(pattern_2d))

pattern_2d = pattern_1d[np.newaxis,:]
print(metrics.minimum_length_scale(pattern_2d))

I would be surprised if the 2D algorithm were inaccurate for 1D designs. In my own ruler I used 1D designs in some test cases. Are you sure this is the case?

In ruler.py, the size of a kernel is allowed to be a float. But for 1d, the declared minimum length scales should always be integers. That is one of the sources of inaccuracy.
In metrics.py, the code disregards the outermost regions by default. So let us not consider those regions. Should we always use IgnoreScheme.NONE in minimum_length_scale for 1d?

@mfschubert
Copy link
Contributor

Thanks for this example. I guess this is a corner case I hadn't considered sufficiently. It is definitely unintended behavior. In any case, I am ok with the 1D codepath, as it seems that there is some more work to merge it with the 2D case.

@mawc2019
Copy link
Collaborator Author

mawc2019 commented Mar 14, 2023

Conclusion: let's go with min(open-design, design-close) ……

This is just what the current version does in minimum_length(), although the way we obtain this quantity is not to calculate solid and void quantities separately. We use the interior solid pixels in (open-design) or (design-close) = close-open as the indicator for the binary search.

…nd void minimum lengths separately and taking the minimum
@mawc2019
Copy link
Collaborator Author

Now the algorithm of the function minimum_length is computing minimum lengths of solid and void regions separately and taking the minimum. The function that performs open minus close is renamed as minimum_length_min.

@stevengj stevengj merged commit 42842e6 into main Mar 16, 2023
@stevengj stevengj deleted the change_default_and_format branch March 16, 2023 20:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants