
Thread: How to remove the content between all occurrences of a certain HTML tag?

  1. #1
    Join Date
    Apr 2020
    Posts
    7
    Thanks
    2

    How to remove the content between all occurrences of a certain HTML tag?

    Hi,

    I am trying to remove <div> elements (the tags and everything between them) from a string using QRegularExpression. When the tag occurs once, it works fine. When it occurs more than once, it fails.

    For example, given

    QString str( "This is a <div style='color:red;'>test</div> to remove all <div>divs</div> from html string" );

    I want to remove every "div" element to produce the following new string:

    "This is a to remove all from html string".

    QRegularExpression rx( "<(div)[^>]*>.*</div>", QRegularExpression::CaseInsensitiveOption ); produces this string instead:

    "This is a from html string"

    so everything from the first <div> to the last </div> is removed in one match.

    I cannot use an XML parser, as the HTML may not be well formed.

    Many thanks.
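The result above is consistent with `.*` being greedy: it runs to the last `</div>` in the string, so one match swallows both elements. A minimal sketch of the difference follows. It uses `std::regex` instead of QRegularExpression so that it builds without Qt, and the helper names `removeDivsGreedy` and `removeDivsLazy` are made up for illustration. QRegularExpression is Perl-compatible, so the lazy form `.*?` should carry over unchanged, and `QRegularExpression::InvertedGreedinessOption` is another way to flip the default.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: the pattern from the question, with the greedy ".*".
// One match spans from the first <div ...> to the last </div>.
std::string removeDivsGreedy(const std::string& html)
{
    static const std::regex rx("<div\\b[^>]*>.*</div>", std::regex::icase);
    return std::regex_replace(html, rx, "");
}

// Hypothetical helper: same pattern with the lazy ".*?".
// Each match stops at the first </div> after its opening tag.
std::string removeDivsLazy(const std::string& html)
{
    static const std::regex rx("<div\\b[^>]*>.*?</div>", std::regex::icase);
    return std::regex_replace(html, rx, "");
}
```

On the example string, the lazy version keeps " to remove all " between the two elements, while the greedy version drops it. The lazy pattern still breaks on a `<div>` nested inside another `<div>`, because it stops at the inner closing tag.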

  2. #2
    Join Date
    Apr 2020
    Posts
    7
    Thanks
    2

    Re: How to remove the content between all occurrences of a certain HTML tag?

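    Figured it out myself. The code below also works when a tag contains another tag, and it can remove several different tags in one call. It needs to run multiple passes for that: each pass only removes elements that contain no other tag, so nested elements are removed from the inside out.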

    Qt Code:
    #include <QDebug>
    #include <QRegularExpression>
    #include <QString>
    #include <QStringList>

    // Removes every element whose tag is in tagList, tags included.
    void removeAllBetweenHtmlTag( QString& content, const QStringList& tagList )
    {
        // Matches an innermost element only: the body [^<]+ must not contain another tag.
        const QString tagStr = QString( "(<\\b(%1)\\b[^>]*>[^<]+</\\b(%1)\\b>)" )
                                   .arg( tagList.join( "|" ) );

        const QRegularExpression rx( tagStr, QRegularExpression::CaseInsensitiveOption );

        // Each pass strips one nesting level; stop once a pass changes nothing.
        while ( true ) {
            const auto lengthBefore = content.length();
            content.replace( rx, " " );
            qDebug() << content; // show the intermediate result of this pass
            if ( content.length() == lengthBefore ) {
                break;
            }
        }
    }

    int main()
    {
        QString str1( "This is a <div style='color:red;'>test</div> to <span>remove</span> all <div>divs containing <span>spans</span></div> from html string" );
        qDebug() << str1;
        removeAllBetweenHtmlTag( str1, QStringList() << "div" << "span" );
        qDebug() << str1;
        return 0;
    }
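To try the repeated-pass idea without Qt, here is a rough `std::regex` counterpart. It is a sketch, not the code from this post: the function name `removeElements` is made up, and the pattern differs in two places as an assumption about what is usually wanted. The backreference `\1` makes the closing tag match the opening one, and `[^<]*` also removes empty elements such as `<div></div>`.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical std::regex counterpart of removeAllBetweenHtmlTag().
// Assumes the tag names are plain words with no regex metacharacters.
std::string removeElements(std::string content, const std::vector<std::string>& tags)
{
    if (tags.empty()) {
        return content; // nothing to remove
    }

    // Build the alternation, e.g. "div|span".
    std::string alternation;
    for (const std::string& tag : tags) {
        if (!alternation.empty()) {
            alternation += "|";
        }
        alternation += tag;
    }

    // Innermost element only: the body [^<]* may not contain another tag,
    // and \1 requires the closing tag to repeat the opening tag name.
    const std::regex rx("<(" + alternation + ")\\b[^>]*>[^<]*</\\1\\s*>",
                        std::regex::icase);

    // Repeat until a pass changes nothing; each pass peels one nesting level.
    for (;;) {
        const std::string next = std::regex_replace(content, rx, " ");
        if (next == content) {
            break;
        }
        content = next;
    }
    return content;
}
```

Comparing the whole string instead of its length is a small design choice: it stays correct even if the replacement text were ever as long as a match. On the test string from `main()`, the first pass leaves `<div>divs containing  </div>` behind and the second pass removes it.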
