Skip to content

Conversation

Wout4
Copy link
Collaborator

@Wout4 Wout4 commented Feb 21, 2025

avoids looping over the entire table, that consists of constants anyway

Copy link
Collaborator

@IgnaceBleukx IgnaceBleukx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add an assertion/typecheck in the constructor of the Table-constraints such that we are 100% certain the has_subexpr valid.
I think that if a solver does not support Table, and hence it's decomposed, this might go wrong currently (the decomposition also works with variables in the table, solver-API's typically don't).

# specialisation to avoid recursing over big tables
def has_subexpr(self):
if not hasattr(self, '_has_subexpr'): # if _has_subexpr has not been computed before or has been reset
arr, tab = self.args # the table 'tab' can only hold constants, never a nested expression
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but we should assert this also in the constructor of Table then.

@Wout4
Copy link
Collaborator Author

Wout4 commented Feb 26, 2025

I'm hesitant to put an assert that loops over the entire table, because that is kind of what we are trying to avoid doing with this optimisation. Maybe it should only be a competition thing where we know the input structure for sure

@Wout4 Wout4 mentioned this pull request Feb 28, 2025
@IgnaceBleukx
Copy link
Collaborator

IgnaceBleukx commented Apr 7, 2025

If I remember correctly, @ThomSerg found a nice (and efficient) way of checking all elements in a Table whether they are the correct type? Similar to what is currently in OR-Tools' constructor

@ThomSerg
Copy link
Collaborator

ThomSerg commented Apr 7, 2025

For the XCSP3 competition we didn't really do a 'more efficient check', we just skipped some. We added the attribute .vars to all expressions, similar to .args, which would return only the decision variables of that expression (no constants). This has a big impact for large tables. Since we know that the entries of the table will always be constants, we only had to do a check on the remaining args.

array = cp.intvar(...)
table = np.array(...) # 2D array of constants
cp.Table(array, table)

So .vars would only need to check array, which is a big saving since the table array can be quite large.
This implementation might be too tailored to the competition / table constraint, but some generic way of accessing subsets of the variables (where applicable like for table) would be nice.

@ThomSerg
Copy link
Collaborator

ThomSerg commented Apr 7, 2025

class Table(GlobalConstraint):
    ...
    def __init__(self, array, table):
        array = flatlist(array)
        if not all(isinstance(x, Expression) for x in array):
            raise TypeError("the first argument of a Table constraint should only contain variables/expressions")
        super().__init__("table", [array, table])

    ...

    @property
    def vars(self):
        return self._args[0]
    
    # specialisation to avoid recursing over big tables
    def has_subexpr(self):
        if not hasattr(self, '_has_subexpr'): # if _has_subexpr has not been computed before or has been reset
            # 2 equivalent options:
            arr, tab = self.args # the table 'tab' can only hold constants, never a nested expression
            arr = self.vars
            # ---
            self._has_subexpr = any(a.has_subexpr() for a in arr)
        return self._has_subexpr

It seems like we completely skipped checking the types of the 'table' part.

@tias tias added this to the v0.9.25 milestone Apr 7, 2025
@IgnaceBleukx
Copy link
Collaborator

Hmm ok not sure how to continue then... Should we indeed add this type-check? And if so, what would be the most efficient way of doing so?
It seems weird to me of not testing it to be honest... As we do quite some type-checks in all our global constraints.

@tias
Copy link
Collaborator

tias commented Apr 8, 2025

Right, so for an XCSP3 input we can do that, because its standard ensures it.

For generic CPMpy we cant.

However, we do assume that the second argument is a 2D list of lists right? This we can check... (the dims), and then, we can still overvwrite has_subexpr() whose default implementation is recursive, to a double for:
for row in arr: for el in row: if isinstance(el, Expression): self._has_subexpr=True; break?

Note that there is another shortcut: if 'arr' is an ndarray and 'dtype' is not 'object', then we know its constant.

However, to implement and see if it makes a difference in runtime, e.g. is worth the extra code, we would need a testcase just for ourself, could even be a random generated one...

@tias
Copy link
Collaborator

tias commented Apr 9, 2025

python3 -c "import timeit, numpy as np; from cpmpy import Table, intvar; n=20000; m=200; t=[[i*m+j for j in range(m)] for i in range(n)]; v=intvar(0, n, shape=m); print('Time:', timeit.timeit(lambda: Table(v, t).has_subexpr(), number=10)/10)"

is 0.7s on my machine.

but so the current PR implementation is not acceptable (though hyperfast as it ignores the arr); might try my suggestions tonight...

@IgnaceBleukx
Copy link
Collaborator

I would actually be in favor of doing an elaborate type-check when constructing the table constraint in the first place.
I.e., double for-loop over the nested list and checking is_num on all of the elements; or doing more fancy stuff such as exploiting the numpy-type.

But the bottom line being: we don't want to check this in the has_sub_expr call (which might happen many times during the transformation stack), but just once at construction time. For me, Tables with variables don't make any sense anyway...
If we want to support such things, we need to go a lot further (e.g., checking in the solver interface if table has some expressions, and then decomposing...) but that's off-topic for this PR

@tias tias marked this pull request as draft April 9, 2025 21:31
@tias
Copy link
Collaborator

tias commented Apr 9, 2025

Converting to draft.

So I implemented the is-const check on the table in the constructor, which means we don't have to check it in has_subexpr(), all good.

I also added the special case for when table is an np.array.

Now, the np.array case is much easier to implement, and also a stronger check: it ensures that the dimensions actually match, that it is a square table... it is also much faster. It is in fact so much faster that it is more efficient to create an ndarray and assert those properties, then to check with isinstance...

Now it goes further. If we would store the table as an np.array, then we don't need the .tolist and then it is even much much faster...

So now a further refactor presents itself: do we always store the 'table' part (for Table, NegTable, ShortTable) as an ndarray?

Advantages: should also be much faster on XCSP3 side: just read it in as ndarray and keep it that way

DIsadvantages: solvers probably expect a list of lists, so we will still need to do a .list() for those solvers I guess... decompositions will stay the same.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants