DB Replacer

Regular Expression to delete a H2 title within the text-body

6 months 1 week ago #77809

Stefan Schoch

Hi there,

After migrating about 2,000 articles from our old Typo3 site to our new Joomla site, we now have hundreds of articles that begin with an H2 heading identical to the 'title' of the article.
Example:
www.schochs2016.alfahosting.org/integral...des-bewusstseins2805
s.nimbus.everhelper.me/share/1441546/wc6dgxlvkcplxby2lhvi
(the image preceding the H2 will be deleted with DB Replacer shortly)

We really don't want to edit hundreds of articles to delete the H2 manually.

So my question is:
How can I use DB Replacer to search within the content for an H2 that is identical to the _title_ of that same article (and then delete it)?

Thx,

Stefan

6 months 1 week ago #77814

Peter van Westen Admin

You cannot use values from other fields in the search in DB Replacer.
So is it enough to just remove all <h2> tags that are at the very start of all articles?

6 months 1 week ago #77820

Stefan Schoch

That's not the best solution, but I'll give it a try.
I'd have to make sure that _only_ an H2 at the very beginning of the article is removed.

Any hint what the regular expression would look like?

6 months 1 week ago #77825

Peter van Westen Admin

You can try this (with Regular Expressions on):

^\s*<h2.*?</h2>
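
For anyone who wants to check the pattern outside DB Replacer first, here is a minimal Python sketch of the same search with an empty replacement. The helper name `strip_leading_h2` and the sample article are made up for illustration, and `re.DOTALL` is an assumption so that a heading spanning several lines still matches; the thread does not show which regex flags DB Replacer itself applies, so behaviour there may differ.

```python
import re

# Pattern from the post: optional leading whitespace, then the first <h2 ...>...</h2>.
# re.DOTALL (an assumption) lets "." cross line breaks inside the heading.
# re.MULTILINE is deliberately NOT set, so "^" matches only at the very start of the text.
LEADING_H2 = re.compile(r"^\s*<h2.*?</h2>", re.DOTALL)


def strip_leading_h2(body: str) -> str:
    """Remove an <h2> element only if it opens the article body."""
    return LEADING_H2.sub("", body, count=1)


# Hypothetical article body: one H2 at the start, one further down.
article = "<h2>Title</h2>\n<p>Intro</p>\n<h2>Section</h2>\n<p>More</p>"
print(strip_leading_h2(article))

# An article that does not start with an H2 is left untouched.
other = "<p>Intro</p>\n<h2>Section</h2>"
print(strip_leading_h2(other) == other)
```

Only the opening heading is removed; the later `<h2>Section</h2>` stays because the pattern is anchored to the start of the text.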

6 months 1 week ago #77827

Stefan Schoch

Thx, I'll give it a try!

6 months 1 week ago #77870

Stefan Schoch

Great!

What if I don't want to _remove_ the H2 heading, but just make it an H5?

So the above regex finds the first (and only the first?) H2 in the article.
What would the replace string look like to convert, e.g.:
<h2>Zustände und Stufen/Strukturen</h2>
into :
<h5>Zustände und Stufen/Strukturen</h5>
(but do so only if the H2 is the very first line in the article, and ignore all other occurrences of H2 in the article)

Thx!

Edit:
I have tried (successfully) this search expression in regex101:

^(<h2>)(.*)(<\/h2>)
and this replace:
<h5>\2<\/h5>

When I try this with DB Replacer, not just the first headline is deleted.
What's wrong?
s.nimbus.everhelper.me/share/1447442/72f6v5uqfwqpwb3b29ry

6 months 1 week ago #77877

Peter van Westen Admin

You need to make the match-all part of the regex non-greedy.
So (.*?) instead of (.*)

Try:

^<h2([^>]*)>(.*?)<\/h2>
<h5\1>\2</h5>
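
The difference between the greedy and non-greedy versions can be reproduced with a short Python sketch. The sample `body` is invented, and `re.DOTALL` is again an assumption: it makes `.` match line breaks, which is one plausible reason the greedy pattern looked fine on a single-line test in regex101 but swallowed more text in a full article. Whether DB Replacer actually runs with that flag is not stated in the thread.

```python
import re

# Hypothetical article body with two H2 headings on separate lines.
body = '<h2>First</h2>\n<p>Text</p>\n<h2 class="sub">Second</h2>'

# Greedy search from the question: with DOTALL, (.*) runs on to the LAST </h2>,
# so everything between the first <h2> and the final </h2> ends up inside the <h5>.
greedy = re.sub(r"^(<h2>)(.*)(<\/h2>)", r"<h5>\2</h5>", body,
                count=1, flags=re.DOTALL)

# Non-greedy search and replace from the reply: (.*?) stops at the FIRST </h2>,
# and ([^>]*) carries any attributes of the <h2> tag over to the <h5> tag.
lazy = re.sub(r"^<h2([^>]*)>(.*?)<\/h2>", r"<h5\1>\2</h5>", body,
              count=1, flags=re.DOTALL)

print(greedy)
print(lazy)
```

With the greedy pattern the second heading is dragged into the replacement; with the non-greedy one only the opening heading changes from H2 to H5.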

See: www.regular-expressions.info/reference.html