docs/src/rosalind/09-subs.md
+38 -6 (38 additions & 6 deletions)
@@ -36,13 +36,14 @@
`2 4 10`

### Handwritten solution
-The clunkiest solution uses a for-loop.
+Let's start off with the most verbose solution.
We can loop over every character within the input string and
check if we can find the substring in the subsequent characters.

+In the first solution,
+we will check each index for an exact match to the substring we are searching for.

```julia
-
dataset = "GATATATGCATATACTTATAT"
search_string = "ATAT"

@@ -58,11 +59,11 @@ function haystack(substring, string)
    end

    output = []
-
    for i in eachindex(string)
        # check if first letter of string matches character at the index
        if string[i] == substring[1]
-            # check if full
+            # check if full substring matches at index
+            # make sure not to search index past string
            if i + length(substring) - 1 <= length(string) && string[i:i+length(substring)-1] == substring
                push!(output, i)
            end
@@ -73,7 +74,17 @@ end

haystack(search_string, dataset)
```
-We can also use the [`findnext`](https://docs.julialang.org/en/v1/base/strings/#Base.findnext) function in Julia so that we don't have to loop through every character in the string.
+We can also use the [`findnext`](https://docs.julialang.org/en/v1/base/strings/#Base.findnext) function in Julia.
+
+There are similar `findfirst` and `findlast` functions,
+but since we want to find all matches,
+we will use `findnext`.
+
+Currently, there isn't a `findall` function that allows us to avoid a loop.
+We'll still also loop over every character in the string,
+as there could be overlapping substrings.
+
+

```julia
function haystack_findnext(substring, string)
@@ -105,6 +116,25 @@ function haystack_findnext(substring, string)
end


+haystack_findnext(search_string, dataset)
```
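
The body of `haystack_findnext` is elided between the hunks above, so as a point of comparison here is a minimal sketch of the `findnext` idea as a standalone function. It is not the PR's implementation: the name `findnext_positions` is hypothetical, and instead of visiting every index it resumes the search one character after each match start, which still catches overlapping hits.

```julia
# Hypothetical helper (not from the PR): collect the start index of every
# occurrence of `substring` in `string`, including overlapping ones.
function findnext_positions(substring, string)
    positions = Int[]
    # An empty pattern matches everywhere; this sketch simply returns no hits.
    isempty(substring) && return positions
    start = firstindex(string)
    while true
        # findnext returns the matching index range, or nothing when no match is left
        r = findnext(substring, string, start)
        r === nothing && break
        push!(positions, first(r))
        # Resume one character after the match start so overlaps are not skipped
        start = nextind(string, first(r))
    end
    return positions
end

findnext_positions("ATAT", "GATATATGCATATACTTATAT")  # returns [2, 4, 10, 18]
```

Advancing from `first(r)` rather than `last(r)` is the design choice that matters here: jumping past the end of each match would miss the occurrence at index 4.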
+
+Lastly, we can also use Regex's search function,
+which produces quite the elegant solution!
+
+
```julia
+function haystack_regex(substring, string)
+    if isempty(substring) || isempty(string)
+        throw(ErrorException("empty sequences"))
+    end
+    if !occursin(substring, string)
+        return []
+    end
+
+    return [m.offset for m in eachmatch(Regex(substring), string, overlap=true)]
+end
+
haystack_findnext(search_string, dataset)
```
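
To see why the regex version passes `overlap=true`, the short sketch below runs `eachmatch` both ways on the same inputs used in this file. The variable names `pattern`, `non_overlapping`, and `overlapping` are illustrative and not part of the PR.

```julia
dataset = "GATATATGCATATACTTATAT"
pattern = Regex("ATAT")

# Default behaviour: matches may not overlap, so the hit starting at index 4
# (which overlaps the match at 2:5) is skipped.
non_overlapping = [m.offset for m in eachmatch(pattern, dataset)]

# With overlap=true, the search resumes one character after each match start.
overlapping = [m.offset for m in eachmatch(pattern, dataset, overlap=true)]

println(non_overlapping)  # [2, 10, 18]
println(overlapping)      # [2, 4, 10, 18]
```

One caveat with `Regex(substring)`: the substring is interpreted as a pattern, so characters such as `.` or `*` would act as metacharacters. For plain DNA strings this makes no difference.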
@@ -115,4 +145,6 @@ Lastly, we can leverage some functions in the Kmers Biojulia package to help us!