Differential Binarization model #2095
Open
mehtamansi29
wants to merge
91
commits into
keras-team:master
from
mehtamansi29:diffbin
Changes from 9 commits
ed97271
ImageText detector preprocessor for Differential Binarization model
mehtamansi29 d97f362
db_utils functions and testfile
mehtamansi29 de3aaae
Diffbin utils function and test file
mehtamansi29 9a3cf2a
diffbin utils function and testfile
mehtamansi29 93ad1ba
diffbin preprocessing function
mehtamansi29 7268535
diffbin postprocessing function
mehtamansi29 f1c3734
diffbin postprocessing function_1
mehtamansi29 d3c74c9
diffbin postprocessing function_2
mehtamansi29 aafef9e
diffbin postprocessing function_3
mehtamansi29 352a089
Merge branch 'keras-team:master' into diffbin
mehtamansi29 d94a2e6
diffbin preocessing and db_utils completed
mehtamansi29 0028b90
Merge branch 'keras-team:master' into diffbin
mehtamansi29 d4724d9
diffbin_backbone model creation and backboone test for diffbin segmen…
mehtamansi29 3c75f47
Merge branch 'keras-team:master' into diffbin
mehtamansi29 d41dc34
modifited diffbin _textdetector
mehtamansi29 4b602c4
Updates image_text_detector preprocessor
mehtamansi29 ee2dced
Updates image_text_detector preprocessor with ignores argument
mehtamansi29 736b0c9
Updates image_text_detector preprocessor,db_utils and formatting with…
mehtamansi29 fcfed6a
Updates image_text_detector_1
mehtamansi29 98e2fbc
Updates image_text_detector_1
mehtamansi29 5fcaefc
Updates image_text_detector_3
mehtamansi29 19c4e79
Updates image_text_detector_3
mehtamansi29 a5516dc
Updates image_text_detector_4
mehtamansi29 b46db73
Updates image_text_detector_5
mehtamansi29 8c42e56
Updates image_text_detector_6
mehtamansi29 6b528a2
Updates image_text_detector_7
mehtamansi29 df67b6c
annotation size
mehtamansi29 34cc866
Merge branch 'keras-team:master' into diffbin
mehtamansi29 876f1af
fill poly keras chages
mehtamansi29 9a4a3d6
fill poly keras changes revert
mehtamansi29 4bbbbb8
diffbin_imagetextdetector import changes
mehtamansi29 5acaaca
diffbin_imagetextdetector changes
mehtamansi29 9b7d7c4
diffbin_imagetextdetector and precommit changes
mehtamansi29 38eab50
diffbin_imagetextdetector and precommit changes
mehtamansi29 9865bc0
diffbin_textdetector_1
mehtamansi29 e57d280
diffbin_textdetector_2
mehtamansi29 1e85236
diffbin_textdetector_3
mehtamansi29 5007488
diffbin_textdetector_4
mehtamansi29 55dd899
Merge branch 'keras-team:master' into diffbin
mehtamansi29 39ae6c3
diffbin_backbon_image_shape
mehtamansi29 dafbaac
diffbin_backbon_image_preprocessor
mehtamansi29 bacee3a
diffbin textdetector update
mehtamansi29 0bf423e
diffbin textdetector update_1
mehtamansi29 623f6c6
new loss function added
mehtamansi29 c103746
loss updated in diffbin_textdetector
mehtamansi29 a5118ae
loss updated in diffbin_textdetector_1
mehtamansi29 3db380f
loss updated in diffbin_textdetector_2
mehtamansi29 7520308
diffbi_text_detector_update_1
mehtamansi29 5b7e11a
Update DB loss function
mehtamansi29 885c8e2
Update DB loss function_1
mehtamansi29 be8e5f3
Update DB loss function_2
mehtamansi29 f77a4f0
Update DB loss function_3
mehtamansi29 20d9ed3
Update DB loss function_4
mehtamansi29 872fe4d
Update DB loss function_5
mehtamansi29 e30d41e
Update DB loss function_6
mehtamansi29 5e52568
Update DB loss function_7
mehtamansi29 0a797a9
Update DB loss function_8
mehtamansi29 9e01b53
Update DB loss function_9
mehtamansi29 c64af71
Update DB loss function_8
mehtamansi29 78ce606
Update diffbin loss function and test file for loss function_1
mehtamansi29 ff56956
Update diffbin loss function_1
mehtamansi29 22dc3bf
Update diffbin loss function_2
mehtamansi29 6c666a2
Update diffbin loss function_3
mehtamansi29 6a823d3
Update diffbin loss function_4
mehtamansi29 80b3c56
Gemini Suggested changes
mehtamansi29 020d629
Resolved conflict
mehtamansi29 bb76db9
Fix PaliGemmaCausalLM example. (#2302)
hertschuh e53aeb0
Routine HF sync (#2303)
divyashreepathihalli 025371f
incorrect condition on self.sliding_window_size (#2289)
laxmareddyp d22c615
Bump the python group with 2 updates (#2282)
dependabot[bot] 2b21c6c
Modify TransformerEncoder masking documentation (#2297)
sonali-kumari1 94b40e5
Fix Gemma3InterleaveEmbeddings JAX inference error by ensuring indice…
pctablet505 b3d18cc
update preset versions (#2307)
laxmareddyp 4c5bcfb
Fix Mistral conversion script (#2306)
laxmareddyp e39b128
Bump the python group with 6 updates (#2317)
dependabot[bot] c3deb47
Qwen3 causal lm (#2311)
kanpuriyanawab c91ca35
Update JAX GPU version (#2319)
sachinprasadhs c99e86d
support flash-attn at torch backend (#2257)
pass-lin 02f3561
Add HGNetV2 to KerasHub (#2293)
harshaljanjani 98df372
Qwen3 presets register (#2325)
laxmareddyp 3eeba26
diffbin_imagetextdetector and precommit changes
mehtamansi29 445f537
Update diffbin loss function and test file for loss function_1
mehtamansi29 afd6251
Resolved conflict
mehtamansi29 848abd0
Revert "Gemini Suggested changes"
mehtamansi29 5820ccc
Resolving Conflicts_1
mehtamansi29 5546ac8
resolving conflict___1
mehtamansi29 f099d84
resolving conflict___2
mehtamansi29 9593170
resolving conflict___3
mehtamansi29 fbb0eed
Merge remote-tracking branch 'upstream/master' into diffbin
mehtamansi29 8d9baf4
Merge branch 'keras-team:master' into diffbin
mehtamansi29 56e4e50
Resolve backend failing testcases
mehtamansi29
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,311 @@ | ||
import keras | ||
|
||
|
||
class Point: | ||
def __init__(self, x, y): | ||
self.x = x | ||
self.y = y | ||
|
||
def __add__(self, other): | ||
return Point(self.x + other.x, self.y + other.y) | ||
|
||
def __sub__(self, other): | ||
return Point(self.x - other.x, self.y - other.y) | ||
|
||
def __neg__(self): | ||
return Point(-self.x, -self.y) | ||
|
||
def cross(self, other): | ||
return self.x * other.y - self.y * other.x | ||
|
||
def to_tuple(self): | ||
return (self.x, self.y) | ||
|
||
|
||
def shrink_polygan(polygon, offset): | ||
sachinprasadhs marked this conversation as resolved.
|
||
""" | ||
Shrinks a polygon inward by moving each point toward the center. | ||
sachinprasadhs marked this conversation as resolved.
|
||
""" | ||
if len(polygon) < 3: | ||
return polygon | ||
|
||
if not isinstance(polygon[0], Point): | ||
polygon = [Point(p[0], p[1]) for p in polygon] | ||
|
||
cx = sum(p.x for p in polygon) / len(polygon) | ||
cy = sum(p.y for p in polygon) / len(polygon) | ||
|
||
shrunk = [] | ||
for p in polygon: | ||
dx = p.x - cx | ||
dy = p.y - cy | ||
norm = max((dx**2 + dy**2) ** 0.5, 1e-6) | ||
shrink_ratio = max(0, 1 - offset / norm) | ||
shrunk.append(Point(cx + dx * shrink_ratio, cy + dy * shrink_ratio)) | ||
|
||
return shrunk | ||
|
||
|
||
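Review note: a minimal pure-Python sketch of the centroid-based shrinking used in `shrink_polygan`, written without the `Point` class so it can be run on its own. The name `shrink_toward_centroid` is hypothetical and not part of this PR. Each vertex travels `offset` units along its ray to the centroid, which is not the same as offsetting every edge inward by `offset`.

```python
import math


def shrink_toward_centroid(points, offset):
    """Move each (x, y) vertex `offset` units toward the vertex centroid."""
    if len(points) < 3:
        return list(points)
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    shrunk = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        # Guard against a vertex that coincides with the centroid.
        norm = max(math.hypot(dx, dy), 1e-6)
        # Vertices closer than `offset` to the centroid collapse onto it.
        ratio = max(0.0, 1.0 - offset / norm)
        shrunk.append((cx + dx * ratio, cy + dy * ratio))
    return shrunk


square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
inner = shrink_toward_centroid(square, math.sqrt(2))
```

For this square, moving each corner by sqrt(2) along its diagonal pulls every edge in by about 1 unit, so `inner` is approximately the square from (1, 1) to (9, 9).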
# Polygon Area | ||
def Polygon(coords): | ||
""" | ||
Calculate the area of a polygon using the Shoelace formula. | ||
""" | ||
coords = keras.ops.convert_to_tensor(coords, dtype="float32") | ||
x = coords[:, 0] | ||
y = coords[:, 1] | ||
|
||
x_next = keras.ops.roll(x, shift=-1, axis=0) | ||
y_next = keras.ops.roll(y, shift=-1, axis=0) | ||
|
||
area = 0.5 * keras.ops.abs(keras.ops.sum(x * y_next - x_next * y)) | ||
return area | ||
|
||
|
||
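Review note: the Shoelace formula in `Polygon` can be checked by hand against a plain-Python version. This is a sketch for verification only; `shoelace_area` is a hypothetical helper name, not part of this PR.

```python
def shoelace_area(points):
    """Area of a simple polygon given as a list of (x, y) vertices."""
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x, y = points[i]
        x_next, y_next = points[(i + 1) % n]  # wrap around to the first vertex
        twice_area += x * y_next - x_next * y
    # abs() makes the result independent of vertex orientation.
    return abs(twice_area) / 2.0


rectangle = [(0, 0), (4, 0), (4, 3), (0, 3)]
triangle = [(0, 0), (4, 0), (0, 3)]
rect_area = shoelace_area(rectangle)
tri_area = shoelace_area(triangle)
```

The 4 x 3 rectangle gives cross terms 0 + 12 + 12 + 0 = 24, hence an area of 12; the right triangle with legs 4 and 3 gives 6.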
# binary search smallest width | ||
def binary_search_smallest_width(poly): | ||
""" | ||
Binary-searches the largest offset by which the polygon can be shrunk | ||
before its area collapses, as a proxy for the polygon's smallest width. | ||
""" | ||
if len(poly) < 3: | ||
return 0 | ||
|
||
low, high = 0, 1 | ||
|
||
while high - low > 0.01: | ||
mid = (high + low) / 2 | ||
mid_poly = shrink_polygan(poly, mid) | ||
mid_poly = keras.ops.cast( | ||
keras.ops.stack([[p.x, p.y] for p in mid_poly]), dtype="float32" | ||
) | ||
area = Polygon(mid_poly) | ||
|
||
if area > 0.1: | ||
low = mid | ||
else: | ||
high = mid | ||
|
||
height = (low + high) / 2 | ||
sachinprasadhs marked this conversation as resolved.
|
||
return int(height) if height >= 0.1 else 0 | ||
|
||
|
||
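Review note: a pure-Python sketch of the bisection in `binary_search_smallest_width`, useful for checking the search bounds. With the upper bound fixed at 1, as in the diff, `height` stays below 1 and `int(height)` is always 0. The optional `high` argument, its default (the farthest vertex distance from the centroid) and all helper names are assumptions made for illustration, not part of this PR.

```python
import math


def _shrink(points, offset):
    """Centroid-based shrink, mirroring shrink_polygan on (x, y) tuples."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    out = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        norm = max(math.hypot(dx, dy), 1e-6)
        ratio = max(0.0, 1.0 - offset / norm)
        out.append((cx + dx * ratio, cy + dy * ratio))
    return out


def _area(points):
    """Shoelace area of a list of (x, y) vertices."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x, y = points[i]
        x_next, y_next = points[(i + 1) % n]
        total += x * y_next - x_next * y
    return abs(total) / 2.0


def smallest_width(points, high=None, tol=0.01, min_area=0.1):
    """Bisect for the largest offset whose shrunk polygon keeps area > min_area."""
    if len(points) < 3:
        return 0
    if high is None:
        # Assumption: the farthest vertex distance bounds any useful offset.
        cx = sum(x for x, _ in points) / len(points)
        cy = sum(y for _, y in points) / len(points)
        high = max(math.hypot(x - cx, y - cy) for x, y in points)
    low = 0.0
    while high - low > tol:
        mid = (low + high) / 2
        if _area(_shrink(points, mid)) > min_area:
            low = mid
        else:
            high = mid
    height = (low + high) / 2
    return int(height) if height >= 0.1 else 0


square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
with_fixed_bound = smallest_width(square, high=1.0)
with_extent_bound = smallest_width(square)
```

For the 10 x 10 square, the fixed bound yields 0 while the extent-based bound yields 6, which suggests the upper bound should scale with the polygon size.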
# project point to line | ||
sachinprasadhs marked this conversation as resolved.
|
||
def project_point_to_line(x, u, v, axis=0): | ||
""" | ||
Projects a point x onto the line defined by points u and v | ||
""" | ||
x = keras.ops.convert_to_tensor(x, dtype="float32") | ||
u = keras.ops.convert_to_tensor(u, dtype="float32") | ||
v = keras.ops.convert_to_tensor(v, dtype="float32") | ||
|
||
n = v - u | ||
n = n / ( | ||
keras.ops.norm(n, axis=axis, keepdims=True) + keras.backend.epsilon() | ||
) | ||
p = u + n * keras.ops.sum((x - u) * n, axis=axis, keepdims=True) | ||
return p | ||
|
||
|
||
# project_point_to_segment | ||
def project_point_to_segment(x, u, v, axis=0): | ||
""" | ||
Projects a point x onto the line segment defined by points u and v | ||
""" | ||
p = project_point_to_line(x, u, v, axis=axis) | ||
outer = keras.ops.greater_equal( | ||
keras.ops.sum((u - p) * (v - p), axis=axis, keepdims=True), 0 | ||
) | ||
near_u = keras.ops.less_equal( | ||
keras.ops.norm(u - p, axis=axis, keepdims=True), | ||
keras.ops.norm(v - p, axis=axis, keepdims=True), | ||
) | ||
o = keras.ops.where(outer, keras.ops.where(near_u, u, v), p) | ||
return o | ||
|
||
|
||
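Review note: for a single point, projecting onto a segment is equivalent to computing the line parameter `t` and clamping it to [0, 1]. The sketch below uses that clamped-parameter form, which differs in structure from the `keras.ops.where` selection in the diff but should give the same point. `project_to_segment` is a hypothetical name.

```python
def project_to_segment(p, u, v):
    """Return the point on segment u-v that is closest to p (all 2D tuples)."""
    px, py = p
    ux, uy = u
    vx, vy = v
    dx, dy = vx - ux, vy - uy
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return (ux, uy)
    # t is the position of the orthogonal projection along u -> v.
    t = ((px - ux) * dx + (py - uy) * dy) / length_sq
    # Clamping t keeps the result on the segment instead of the infinite line.
    t = min(1.0, max(0.0, t))
    return (ux + t * dx, uy + t * dy)


u, v = (0.0, 0.0), (4.0, 0.0)
inside = project_to_segment((2.0, 3.0), u, v)
before_start = project_to_segment((-1.0, 1.0), u, v)
past_end = project_to_segment((6.0, 2.0), u, v)
```

A point above the middle of the segment lands on (2, 0); points whose projection falls outside the segment snap to the nearer endpoint, (0, 0) or (4, 0).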
# get line of height | ||
def get_line_height(poly): | ||
return binary_search_smallest_width(poly) | ||
sachinprasadhs marked this conversation as resolved.
|
||
|
||
|
||
# cv2.fillPoly equivalent implemented with keras.ops | ||
def fill_poly_keras(vertices, image_shape): | ||
sachinprasadhs marked this conversation as resolved.
|
||
""" | ||
Rasterize a polygon into a binary mask, emulating cv2.fillPoly with | ||
keras.ops. Uses ray casting (even-odd rule) to test each pixel. | ||
""" | ||
height, width = image_shape | ||
x = keras.ops.arange(width) | ||
y = keras.ops.arange(height) | ||
xx, yy = keras.ops.meshgrid(x, y) | ||
xx = keras.ops.cast(xx, "float32") | ||
yy = keras.ops.cast(yy, "float32") | ||
|
||
result = keras.ops.zeros((height, width), dtype="float32") | ||
|
||
vertices = keras.ops.convert_to_tensor(vertices, dtype="float32") | ||
num_vertices = vertices.shape[0] | ||
|
||
for i in range(num_vertices): | ||
x1, y1 = vertices[i] | ||
x2, y2 = vertices[(i + 1) % num_vertices] | ||
|
||
# Modified conditions to potentially include more boundary pixels | ||
cond1 = (yy > keras.ops.minimum(y1, y2)) & ( | ||
yy <= keras.ops.maximum(y1, y2) | ||
) | ||
cond2 = xx < (x1 + (yy - y1) * (x2 - x1) / (y2 - y1)) | ||
|
||
result = keras.ops.where( | ||
cond1 & cond2 & ((y1 > yy) != (y2 > yy)), 1 - result, result | ||
) | ||
|
||
result = keras.ops.cast(result, "int32") | ||
return result | ||
|
||
|
||
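Review note: a per-pixel, pure-Python sketch of the even-odd ray-casting test that `fill_poly_keras` vectorizes. It uses only the half-open straddle test `(y1 > py) != (y2 > py)`; the diff additionally applies `cond1`, so its output can differ on rows that pass exactly through a vertex. The names `point_in_polygon` and `fill_polygon` are hypothetical.

```python
def point_in_polygon(px, py, vertices):
    """Even-odd rule: count edge crossings of a ray cast from (px, py) to +x."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # The edge straddles the ray's row; horizontal edges never do.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def fill_polygon(vertices, height, width):
    """Return a height x width list-of-lists mask with 1 inside the polygon."""
    return [
        [1 if point_in_polygon(x, y, vertices) else 0 for x in range(width)]
        for y in range(height)
    ]


square = [(1, 1), (4, 1), (4, 4), (1, 4)]
mask = fill_polygon(square, 6, 6)
filled = sum(sum(row) for row in mask)
```

With this half-open convention the square covers the nine pixels with x and y in 1..3: the top and left boundaries are included, the bottom and right ones are not.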
# get mask | ||
def get_mask(w, h, polys, ignores): | ||
""" | ||
Generates a binary mask where: | ||
- Ignored regions are set to 0 | ||
- All other pixels, including text regions, are set to 1 | ||
""" | ||
mask = keras.ops.ones((h, w), dtype="float32") | ||
|
||
for poly, ignore in zip(polys, ignores): | ||
poly = keras.ops.cast(keras.ops.convert_to_numpy(poly), dtype="int32") | ||
|
||
if poly.shape[0] < 3: | ||
print("Skipping invalid polygon:", poly) | ||
continue | ||
|
||
poly_mask = fill_poly_keras(poly, (h, w)) | ||
|
||
if ignore: | ||
mask = keras.ops.where( | ||
keras.ops.cast(poly_mask, "float32") == 1.0, | ||
keras.ops.zeros_like(mask), | ||
mask, | ||
) | ||
else: | ||
mask = keras.ops.maximum(mask, poly_mask) | ||
return mask | ||
|
||
|
||
# get polygon coordinates projection | ||
def get_coords_poly_projection(coords, poly): | ||
""" | ||
Projects a set of points onto the edges of a polygon and returns, for | ||
each point, the closest projected point. | ||
""" | ||
start_points = keras.ops.array(poly, dtype="float32") | ||
end_points = keras.ops.concatenate( | ||
[ | ||
keras.ops.array(poly[1:], dtype="float32"), | ||
keras.ops.array(poly[:1], dtype="float32"), | ||
], | ||
axis=0, | ||
) | ||
region_points = keras.ops.array(coords, dtype="float32") | ||
|
||
projected_points = project_point_to_segment( | ||
keras.ops.expand_dims(region_points, axis=1), | ||
keras.ops.expand_dims(start_points, axis=0), | ||
keras.ops.expand_dims(end_points, axis=0), | ||
axis=2, | ||
) | ||
|
||
projection_distances = keras.ops.norm( | ||
keras.ops.expand_dims(region_points, axis=1) - projected_points, axis=2 | ||
) | ||
|
||
indices = keras.ops.expand_dims( | ||
keras.ops.argmin(projection_distances, axis=1), axis=-1 | ||
) | ||
best_projected_points = keras.ops.take_along_axis( | ||
projected_points, indices[..., None], axis=1 | ||
)[:, 0, :] | ||
|
||
return best_projected_points | ||
|
||
|
||
# get polygon coordinates distance | ||
def get_coords_poly_distance(coords, poly): | ||
""" | ||
Calculates the distance from each point in a set to the polygon. | ||
""" | ||
projection = get_coords_poly_projection(coords, poly) | ||
return keras.ops.linalg.norm(projection - coords, axis=1) | ||
|
||
|
||
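Review note: the two functions above project every point onto every edge and keep the nearest projection. A scalar sketch of the same idea, assuming 2D tuples and a closed polygon, is shown below; `closest_on_segment` and `distance_to_polygon` are hypothetical names used only for this check.

```python
import math


def closest_on_segment(p, u, v):
    """Closest point to p on segment u-v, via a clamped line parameter."""
    dx, dy = v[0] - u[0], v[1] - u[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return u
    t = ((p[0] - u[0]) * dx + (p[1] - u[1]) * dy) / length_sq
    t = min(1.0, max(0.0, t))
    return (u[0] + t * dx, u[1] + t * dy)


def distance_to_polygon(p, polygon):
    """Distance from p to the polygon boundary (minimum over all edges)."""
    n = len(polygon)
    best = math.inf
    for i in range(n):
        # Edge i runs from vertex i to vertex i + 1, wrapping at the end.
        q = closest_on_segment(p, polygon[i], polygon[(i + 1) % n])
        best = min(best, math.hypot(p[0] - q[0], p[1] - q[1]))
    return best


square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
d_inside = distance_to_polygon((2.0, 1.0), square)
d_side = distance_to_polygon((6.0, 2.0), square)
d_corner = distance_to_polygon((5.0, 5.0), square)
```

The distance is measured to the boundary, so points inside the polygon also get a positive value: (2, 1) is 1 unit from the bottom edge, (6, 2) is 2 units from the right edge, and (5, 5) is sqrt(2) from the nearest corner.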
# get normalized weight | ||
def get_normalized_weight(heatmap, mask, background_weight=3.0): | ||
""" | ||
Computes per-pixel weights inside `mask`: `background_weight` for | ||
negative pixels and a class-balancing weight for positive pixels. | ||
""" | ||
pos = keras.ops.greater_equal(heatmap, 0.5) | ||
neg = keras.ops.ones_like(pos, dtype="float32") - keras.ops.cast( | ||
pos, dtype="float32" | ||
) | ||
pos = keras.ops.logical_and(pos, mask) | ||
neg = keras.ops.logical_and(neg, mask) | ||
npos = keras.ops.sum(pos) | ||
nneg = keras.ops.sum(neg) | ||
smooth = ( | ||
keras.ops.cast(npos, dtype="float32") | ||
+ keras.ops.cast(nneg, dtype="float32") | ||
+ 1 | ||
) * 0.05 | ||
wpos = (keras.ops.cast(nneg, dtype="float32") + smooth) / ( | ||
keras.ops.cast(npos, dtype="float32") + smooth | ||
) | ||
weight = keras.ops.zeros_like(heatmap) | ||
neg = keras.ops.cast(neg, "bool") | ||
weight = keras.ops.where(neg, background_weight, weight) | ||
pos = keras.ops.cast(pos, "bool") | ||
weight = keras.ops.where(pos, wpos, weight) | ||
return weight | ||
|
||
|
||
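Review note: the positive-class weight in `get_normalized_weight` reduces to a small piece of arithmetic on the pixel counts. The sketch below reproduces only that scalar formula; `positive_weight` and `smooth_ratio` are hypothetical names for the expression and the constant 0.05 in the diff.

```python
def positive_weight(npos, nneg, smooth_ratio=0.05):
    """Weight for positive pixels given positive and negative pixel counts."""
    # The smoothing term grows with the total count and avoids division by zero.
    smooth = (npos + nneg + 1) * smooth_ratio
    return (nneg + smooth) / (npos + smooth)


imbalanced = positive_weight(10, 90)
balanced = positive_weight(50, 50)
empty = positive_weight(0, 0)
```

With 10 positives and 90 negatives, `smooth` is 5.05 and the weight is 95.05 / 15.05, roughly 6.32, so the rarer positive pixels are up-weighted. Equal counts, or an empty mask, give a weight of exactly 1.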
# Getting region coordinates | ||
def get_region_coordinate(w, h, poly, heights, shrink): | ||
""" | ||
Extract coordinates of regions corresponding to text lines in an image. | ||
""" | ||
label_map = keras.ops.zeros((h, w), dtype="float32") | ||
for line_id, (p, height) in enumerate(zip(poly, heights)): | ||
if height > 0: | ||
poly_points = [Point(row[0], row[1]) for row in p] | ||
shrinked_poly = shrink_polygan(poly_points, height * shrink) | ||
shrunk_poly_tuples = [point.to_tuple() for point in shrinked_poly] | ||
shrunk_poly_tensor = keras.ops.convert_to_tensor( | ||
shrunk_poly_tuples, dtype="float32" | ||
) | ||
filled_polygon = fill_poly_keras(shrunk_poly_tensor, (h, w)) | ||
label_map = keras.ops.maximum(label_map, filled_polygon) | ||
|
||
label_map = keras.ops.convert_to_tensor(label_map) | ||
sorted_tensor = keras.ops.sort(keras.ops.reshape(label_map, (-1,))) | ||
diff = keras.ops.concatenate( | ||
[ | ||
keras.ops.convert_to_tensor([True]), | ||
(sorted_tensor[1:] != sorted_tensor[:-1]), | ||
] | ||
) | ||
diff = keras.ops.reshape(diff, (-1,)) | ||
indices = keras.ops.convert_to_tensor(keras.ops.where(diff)) | ||
indices = keras.ops.reshape(indices, (-1,)) | ||
unique_labels = keras.ops.take(sorted_tensor, indices) | ||
unique_labels = unique_labels[unique_labels != 0] | ||
regions_coords = [] | ||
for label in unique_labels: | ||
mask = keras.ops.equal(label_map, label) | ||
y, x = keras.ops.nonzero(mask) | ||
coords = keras.ops.stack([x, y], axis=-1) | ||
regions_coords.append(keras.ops.convert_to_numpy(coords)) | ||
|
||
return regions_coords |